Guidelines

This site is for tech Q&A. Please keep your posts focused on the subject at hand.

Ask one question at a time. Don't conflate multiple problems into a single question.

Make sure to include all relevant information in your posts. Try to avoid linking to external sites.

Links to documentation are fine, but in addition you should also quote the relevant parts in your posts.

1 vote
3.0k views
3.0k views

I would like to create related items using tags. The simplified database structure is somewhat like this

| postid | title              | tags                        |
| 1      | Neverwinter Nights | game,adventure,role-playing |
| 2      | Genshin Impact     | game,turn-based             |
| 3      | League of Legends  | game,moba,esports,pc        |
| 4      | PUBG               | esports                     |

Articles are generated from this (simplified) table.

An example article will have a $tags variable like this

$tags = 'esports,auto-battle';

So, how do I create a MySQL query that can match one of the tags with one of the value in tags column in the database?

Like in this example, the desired result will be the record #3 and #4.

SELECT postid, title FROM e_table WHERE ???

Edit: I'm having some clues. Will this work or is it syntactically correct?

$tags = 'esports,auto-battle';
$tags = preg_replace(",", "|", $tags);
SELECT postid, title FROM e_table WHERE `tags` REGEXP '$tags';
in Scripting
edited by
by (50)
1 4
edit history
 
Please edit your question if you need to provide additional information.

Please log in or register to answer this question.

1 Answer

1 vote

To answer the immediate question: yes, the regular expression approach should work for your described scenario.

With that said, I do not recommend going that route, because the general design it's based on is not very good. You will likely run into issues with different queries (like "give me the games matching all of these tags"), and sanitizing a tag list that is kept as a single comma-separated string may cause additional complications.

A better approach would be normalizing the database table and keeping the tag list in your PHP application as an array that you only merge into a string when you actually run a database query.

A normalized table structure would look like this:

Table games:

| id | title              |
+----+--------------------+
| 1  | Neverwinter Nights |
| 2  | Genshin Impact     |
| 3  | League of Legends  |
| 4  | PUBG               |

Table tags:

| id | title        |
+----+--------------+
| 1  | game         |
| 2  | adventure    |
| 3  | role-playing |
| 4  | turn-based   |
| 5  | moba         |
| 6  | esports      |
| 7  | pc           |

Table mapping:

| gameid | tagid |
+--------+-------+
| 1      | 1     |
| 1      | 2     |
| 1      | 3     |
| 2      | 1     |
| 2      | 4     |
| 3      | 1     |
| 3      | 5     |
| 3      | 6     |
| 3      | 7     |
| 4      | 6     |

SQL:

CREATE TABLE games (
  id    int          NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title varchar(255) NOT NULL
);
CREATE TABLE tags (
  id    int          NOT NULL AUTO_INCREMENT PRIMARY KEY,
  title varchar(255) NOT NULL
);
CREATE TABLE mapping (
  gameid int NOT NULL,
  tagid  int NOT NULL,
  PRIMARY KEY (gameid, tagid)
);

Your base query would then map games and tags like this:

SELECT g.id
  , g.title AS 'game'
  , t.title AS 'tag'
FROM games g
  INNER JOIN mapping m ON g.id = m.gameid
  INNER JOIN tags t ON m.tagid = t.id;

which would give a result like this (the tag column is included only for demonstration purposes and will be omitted in the other examples below):

| id | game               | tag          |
+----+--------------------+--------------+
| 1  | Neverwinter Nights | game         |
| 1  | Neverwinter Nights | adventure    |
| 1  | Neverwinter Nights | role-playing |
| 2  | Genshin Impact     | game         |
| 2  | Genshin Impact     | turn-based   |
| 3  | League of Legends  | game         |
| 3  | League of Legends  | moba         |
| 3  | League of Legends  | esports      |
| 3  | League of Legends  | pc           |
| 4  | PUBG               | esports      |

That'll allow you to construct the selection clauses of your query whichever way you need.

For instance, fetching the games that match any of the given tags would be done with this SQL query:

SELECT g.id, g.title AS 'game'
FROM games g
  INNER JOIN mapping m ON g.id = m.gameid
  INNER JOIN tags t ON m.tagid = t.id
WHERE t.title IN ('esports','auto-battle');

In PHP:

$tags  = Array('esports', 'auto-battle');
$query = "SELECT g.id, g.title AS 'game'" .
  " FROM games g INNER JOIN mapping m ON g.id = m.gameid INNER JOIN tags t ON m.tagid = t.id" .
  " WHERE t.title IN ('" . implode("','", $tags) . "');";

Whereas fetching the games that match all of the given tags would be done with an SQL query like this:

SELECT g.id, g.title AS 'game'
FROM games g
  INNER JOIN mapping m ON g.id = m.gameid
  INNER JOIN tags t ON m.tagid = t.id
GROUP BY g.title
HAVING COUNT(CASE WHEN t.title IN ('esports','auto-battle') THEN 1 END) = 2;

In PHP:

$tags  = Array('esports', 'auto-battle');
$query = "SELECT g.id, g.title AS 'game'" .
  " FROM games g INNER JOIN mapping m ON g.id = m.gameid INNER JOIN tags t ON m.tagid = t.id" .
  " GROUP BY t.title HAVING COUNT(CASE WHEN t.title IN ('" . implode("','", $tags) . "') THEN 1 END) = " . count($tags) . ";";

Note that all of these examples assume that $tags contains an array of unique, properly sanitized strings. That is something you must ensure first.

Also note that when imploding the array you must do it in a way that adds quotes around the strings (as in my examples). Never use bare strings in SQL queries.

by (100)
1 13 28
edit history
 
Thanks for reminding me of script or sql injection. The good news is that this table is not user-generated.

I still have some doubts. When I display an article, I need to display the tags too. If I divided into 3 tables. The variable $tags is hard to generate. I mean, first, I need to pull the data from the main table to display an article. And then I used the $tags variable to create related posts. But how can I get this variable? Maybe the structure of this variable is different. It's not a comma-separated string any more.

Edit: To correct, I don't need to display tags  on the article. Tags are used to sort or filter only, they are hidden from normal users' view. And I don't insist using a comma-separated string. So, if we use 3 tables. It's not necessary to create a variable that is a comma-separated string. Maybe some ways to directly get the "related" data.
 
I think If I create 3 tables, I can get the 'related' data by doing a 2-step query on the mapping table (or two separate queries) using the $gameid variable. There's no need to generate the $tags variable.

Thank you. I don't want to bother you more.
 
Hmm... you could define a view or materialized view using the base query from my answer. That would allow you to work with the data as if it were just one table.
 
"Thank you. I don't want to bother you more."

You're welcome, and don't worry about it. I wouldn't be running this forum or posting answers if I didn't want to.
...