Extracting Key-worlds out of string and show them in another column

The name of the pictureThe name of the pictureThe name of the pictureClash Royale CLAN TAG#URR8PPP


Extracting Key-worlds out of string and show them in another column



I need to write a query to extract specific names out of String and have them show in another column for example a column has this field



Column:
Row 1: jasdhj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj
Row 2:jasnasc8918212,ahsahkdjjMina67,
Row 3:kasdhakshd,asda,asdasd,121,121,Sina878788kasas



Key Words: Amin,Mina,Sina



How could I have these key words in another column? I dont want to insert another column but if that's the only solution let me know.
Any help appreciated!





You should improve the formatting of you row data. It's currently quite hard to understand.
– Gregor Koukkoullis
2 days ago





what the logic for extracting those keywords? do you have those keywords list pre-defined and just need to detect which ones present in which row?
– Mikhail Berlyant
2 days ago




2 Answers
2



Below is for BigQuery Standard SQL


#standardSQL
WITH keywords AS (
SELECT keyword
FROM UNNEST(SPLIT('Amin,Mina,Sina')) keyword
)
SELECT str, STRING_AGG(keyword) keywords_in_str
FROM `project.dataset.table`
CROSS JOIN keywords
WHERE REGEXP_CONTAINS(str, CONCAT(r'(?i)', keyword))
GROUP BY str



You can test, play with above using dummy data from your question as below


#standardSQL
WITH `project.dataset.table` AS (
SELECT 'jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj' str UNION ALL
SELECT 'jasnasc8918212,ahsahkdjjMina67,' UNION ALL
SELECT 'kasdhakshd,asda,asdasd,121,121,Sina878788kasas'
), keywords AS (
SELECT keyword
FROM UNNEST(SPLIT('Amin,Mina,Sina')) keyword
)
SELECT str, STRING_AGG(keyword) keywords_in_str
FROM `project.dataset.table`
CROSS JOIN keywords
WHERE REGEXP_CONTAINS(str, CONCAT(r'(?i)', keyword))
GROUP BY str



with results as


Row str keywords_in_str
1 jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj Amin,Mina
2 jasnasc8918212,ahsahkdjjMina67, Mina
3 kasdhakshd,asda,asdasd,121,121,Sina878788kasas Sina





But the column has hundreds of row and I can't add all the row in select. So is there anyway I can write it to look in each row instead of adding each string in SELECT 'jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj' str UNION ALL SELECT 'jasnasc8918212,ahsahkdjjMina67,' UNION ALL SELECT 'kasdhakshd,asda,asdasd,121,121,Sina878788kasas'
– Amin
2 days ago







see update. so hopefully it is more clear for you now how to use answer. just replace project.dataset.table with your real table reference
– Mikhail Berlyant
2 days ago




project.dataset.table





as FYI: you can mark accepted answer by using the tick on the left of the posted answer, below the voting. see meta.stackexchange.com/questions/5234/… for why it is important.
– Mikhail Berlyant
2 days ago





It works thank you so much
– Amin
2 days ago





when I'm adding my criteria in this query I'm getting the error that Unrecognized name: str at [8:23]. I dont know whats the problem with the query. could you please advise! WITH keywords AS ( SELECT keyword FROM UNNEST(SPLIT('a,b,s')) keyword ) SELECT STRING_AGG(keyword) keywords_in_str FROM xxx CROSS JOIN keywords WHERE REGEXP_CONTAINS(keyword, CONCAT(r'(?i)')) GROUP BY str
– Amin
yesterday




xxx



to count the no of keywords


#standardSQL
WITH `project.dataset.table` AS (
SELECT 'jasdhMINAj31e31jh123hkkj,12l1,3jjds,Amin,02323rdcsnj' str UNION ALL
SELECT 'jasnasc8918212,ahsahkdjjMina67,' UNION ALL
SELECT 'kasdhakshd,asda,asdasd,121,121,Sina878788kasas'
)
select str,array(select as struct countif(lower(x) ="amin") amin,countif(lower(x) ="mina") mina,countif(lower(x)="sina") sina from unnest(x)x)keyword from
(select str,regexp_extract_all(str,"(?i)(Amin|Mina|Sina)")x from `project.dataset.table`)






By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

Popular posts from this blog

Keycloak server returning user_not_found error when user is already imported with LDAP

PHP parse/syntax errors; and how to solve them?

How to scale/resize CVPixelBufferRef in objective C, iOS