Guidelines

This site is for tech Q&A. Please keep your posts focused on the subject at hand.

Ask one question at a time. Don't conflate multiple problems into a single question.

Make sure to include all relevant information in your posts. Try to avoid linking to external sites.

Links to documentation are fine, but in addition you should also quote the relevant parts in your posts.

0 votes
668 views
668 views

I'm moving a part of my old database table to a new database (Question2Answer CMS). In another words, I'm copying some questions to a new Q2A site.
The ^posts table contains a column named userid in INTEGER format. Its current values ranges from 2 to about 600.
For some reasons, I would like to make a change purposely. I would like to update those values in userid column to take random values from 2500 to say 3500.

Before:
userid
2
15
376
114
..
After:
userid
2734
3002
2678
2905

How should I do?

in Scripting
by (50)
1 5
edit history

Please log in or register to answer this question.

1 Answer

0 votes
 

The simplest approach is probably to just add 2500 to the existing IDs, since the target range (2500..3500) doesn't overlap with the existing range (2..600).

UPDATE qa_posts SET userid = userid+2500;

You could in theory set a random value between 2500 and 3500 like this:

UPDATE qa_posts SET userid = FLOOR(RAND() * 1001 + 2500) WHERE postid=42;

But you would need to do it for each post separately, or all posts would get assigned the same random value. Even if you do it separately for each post there will likely be collisions, i.e. some rows getting the same (random) value.

With that said, I don't think naïve approaches like the ones outlined above will work at all. The userid column in the table qa_posts is a foreign key, i.e. it references the column userid in the table qa_users, so you can't just insert arbitrary numbers. The IDs you insert must already exist in the table qa_users otherwise you'll be getting an error.

Updating a column in one table with random values from another table isn't all that simple, though. The SQL statements below would select the rows from qa_users in random order and then update the userid column of the qa_posts table with the user ID from the corresponding row in that randomized user list. However, that limits the update to the minimum of user count and post count, meaning that if you have 5 users and 10 posts, only the first 5 posts will get the userid changed.

SET @row1=0, @row2=0;
UPDATE qa_posts p JOIN (
  SELECT t1.postid, t2.userid
  FROM (
    SELECT postid, @row1:=@row1+1 AS row
    FROM qa_posts
  ) AS t1 INNER JOIN (
    SELECT userid, @row2:=@row2+1 AS row
    FROM (
      SELECT userid
      FROM qa_users
      ORDER BY RAND()
    ) AS t
  ) AS t2 ON t1.row = t2.row
) AS u ON p.postid = u.postid
SET p.userid = u.userid;

It might be easier to fetch the user and post IDs with an external script and generate a series of UPDATE (or INSERT) statements from them, particularly if it's a migration task that will be run only a limited number of times. In Bash you could for instance do something like this:

#!/bin/bash

dbname='your_database'

readarray -t userids < <(mysql -u root -BN -e "SELECT userid FROM qa_users" "$dbname")
readarray -t postids < <(mysql -u root -BN -e "SELECT postid FROM qa_posts" "$dbname")

usercount="${#userids[@]}"
for id in "${postids[@]}"; do
  rnd="$((RANDOM % usercount))"
  echo "UPDATE qa_posts SET userid=\"${userids[$rnd]}\" WHERE postid=\"${id}\";"
done | mysql -u root "$dbname"
by (115)
2 19 33
edit history
 
I didn't think of the overlapping issue before asking. Your simple trick to add 2500 is cool. Thanks.
The reason I wanted to change from 2-600  to 2500-3500 is that  there are users in the ranges 2500-3500 in qa_users, but there is no userid in 2-600 range (as a result of restructuring data). Those are dummy accounts to ask and answer questions if necessary. So it doesn't matter who they are.
 
I think I can perform the opposite solution too, like changing some userids in qa_users to fill up the missing range of 2-600.
...