How to make multiple updates using a single query in MySQL

As you might know it’s quite easy to make multiple INSERTs in a single query, like this:

INSERT INTO mytable
(id, title)
VALUES
('1', 'Lord of the Rings'),
('2', 'Harry Potter');

However, for some strange reason you can’t do multiple changes to a table in a single UPDATE query like this:

UPDATE mytable
SET (title='Great Expectations' WHERE id='1'),
(title='War and Peace' WHERE id='2');


However, you can do a very interesting trick. You can combine an UPDATE with a CASE like this:

UPDATE mytable SET title = CASE
WHEN id = 1 THEN 'Great Expectations';
WHEN id = 2 THEN 'War and Peace';
ELSE title
END;

The ELSE title is very important, otherwise you will overwrite the rest of the table with NULL.

You might wonder why on earth you’d want to make multiple updates in a single query. Here’s an example that might not be so obvious:

Let’s say you want to make a page view counter for your shop, but you want to implement caching for your pages and running an SQL UPDATE query for each page is out of the question. An efficient solution would be to make a logfile with each view as a new line appended in a file.

Here’s an example of logfile, with each view logged on a single line using this format: IP | PRODUCT_ID

78.32.43.2|3
54.133.87.54|2
85.83.93.91|4

The append part is most important, since it’s the only file writing mode that can be used with multiple processes (remember that you have MANY viewers at the same time, and if only 2 of them try to write a file at the same time your file will be pretty much screwed – no, you don’t lose the file itself, you just lose the content, which is even more important).

An alternative would be to use file locking, but it’s really really inefficient when you have lots of visitors (digg effect anyone?). However, the append mode is optimized by the operating system and stores the content in a buffer, making sure the content is write properly.

So, the best solution would be to make a logfile and then process this file with a cron job. It’s very easy to do that, but the challenge lies in updating the database. Here’s where the multiple updates in a single query trick comes into play.

You can just create a long query to update the database and run it only once instead of hundreds of small queries (which in case you didn’t figure it out, would bring your database to its knees in many cases).

So we can make a script to parse our logfile like this:

$logfile = 'logfile.txt';

$views = array();
$ip_db = array();

$lines = file($logfile);

$cnt = count($lines);
for ($i=0; $i<$cnt; $i++)
{
	if (preg_match('/([0-9.]+)\|([0-9]+)/', $lines[$i], $regs))
	{
		$ip = $regs[1];
		$id = $regs[2];

		if (!isset($ip_db[$ip]))
		{
			$ip_db[$ip] = 1;
			$views[$id]++;
		}
	}
}

if (empty($views))
{
	exit;
}

$query = "UPDATE products SET views = CASE ";

$idlist = '';

reset($views);
while (list($id, $val) = each($views))
{
	$query .= " WHEN id = ".$id." THEN views + ".$val." ";
	$idlist .= $id.',';
}

$idlist = substr($idlist, 0, -1);

$query .= "
END
WHERE id IN (".$idlist.")";

// run $query

Simple and efficient. And did I mention it’s also free?

Download link: parse_logfile.php

Important update:

Just as Jesper pointed out using transactions instead of CASE can make a huge difference.

Here are the results of some tests made on a 316,878 records database using both MyISAM and InnoDB storage engine.

InnoDB

Rows updated Transactions time
(in seconds)
CASE time
(in seconds)
400 6s 11s
1000 20s 17s
30000 (too long) (too long)

MyISAM

Rows updated Consequent queries time
(in seconds)
CASE time
(in seconds)
400 0s 6s
1000 0s 13s
30000 10 (too long)

As you can see the results are very interesting. You can clearly see the difference of both storage engine and transactions (at least for MyISAM). In other words, if you use MyISAM (and many people do), use transactions for multiple updates. If you use InnoDB switch to MyISAM and them use transactions.

The CASE method seems to be efficient only for few updates (so you can make a cron job to run updates more frequently), but overall you can see clearly that you can still use transactions and get better results.

I hope the method described in this post helped you or at least gave you more ideas on how to use it for any of your projects, and if you didn’t know about transactions, maybe you should give it more attention (I sure will from now on).

Second (and hopefully final) update:

After a bit of research I figured out MyISAM doesn’t support transactions (yes, silly me), so the tests above were done using simple consequent queries (modified up there as well). However, it seems MySQL does some internal optimizations and runs them very efficiently. If you have a busy database it’s a good idea to do a LOCK TABLE query before the batch update though (and of course don’t forget to UNLOCK the table when done).

Also after a few suggestions I got to a much faster version (read the comments below – Thanks, guys).

So here’s the final CASE version:

UPDATE mytable SET title = CASE
WHEN id = 1 THEN ‘Great Expectations’
WHEN id = 2 THEN ‘War and Peace’
...
END
WHERE id IN (1,2,...)

Although it doesn’t beat the speed of normal queries run in a row, it can still get close enough.

Also, since many people asked why bother that much when you can just run the normal queries, you can consider this an experiment to find alternatives. Also, keep in mind that in order to gain advantage of MySQL optimizations all those UPDATE queries need to be run in a batch. Running each UPDATE query when the page is viewed for example is not the same thing (and that’s pretty much the problem I wanted to solve as efficiently as possible).

Also, as a final note, you should always run an OPTIMIZE TABLE query after all these updates.

35 thoughts on “How to make multiple updates using a single query in MySQL

  1. Your ‘CASE’ example looks really inefficient to me. It should say WHERE id=’1′ OR id=’2′, to avoid scanning the whole table.

  2. As a development DBA, I would never let you run this SQL against my database. You are touching every row in the table. Potentially millions of rows. You are updating values that might be indexed, potentially causing indexes to be updated. You are acquiring write locks on all those rows. You are causing the old version of the row to be copied aside, to be used for query snapshot consistency (read committed transaction isolation level). You are causing a lot of log IO (and then archive log IO, and replication log IO) that is going to slow down the server.

    Basically, unless you begin by saying: “this is a very special case that is used only on this tiny table that is guaranteed to never exceed 5 rows because _good reason_”, I will have to make sure this code doesn’t get pushed into QA env. until it’s reworked.

    You have the ids when you’re generating this. Use the WHERE, Luke!

  3. The WHERE id=’1′ OR id=’2′ thing works only if you want to update multiple rows with the SAME value. Usually that’s not the case.

    As for the transaction, it can be an idea but it still runs N queries. The whole idea was to run a SINGLE query.

    Indeed, it does lock the whole table when it’s run and it does verify ALL rows, but I don’t see any problem (unless there’s a database crash or something critical). Also it DOESN’T affect all rows since MySQL is smart enough not to waste time and change a value with itself. You can run it in phpMyAdmin or run a mysql_affected_rows after it, you’ll see it affects only the rows that need to be updated.

    Also the speed it pretty good, I still need to test it on a huge table, but for my example a products table isn’t necessarily huge (on average I’d say 1000-10000 rows), so it should be quite efficient in the end.

    Please note, if you want to change all rows with the SAME value, it is indeed much more efficient to just use something like UPDATE mytable SET value = value + 10 WHERE id=10 OR id=12, however, this simply doesn’t work if you want to change each row with a different value.

    As for other RDBMS it should work as well, but you might need to make minor syntax adjustments (of course you won’t be able to know the answer for sure until you try it, I personally only use MySQL so I simply didn’t bother).

  4. I have to disagree with you there. The problem with your case is that it will in fact traverse every single row and compare the id to your CASE.

    The point of having WHERE is that you give your DBMS a chance to use indexes on its own. If you have an index on ‘id’, which I’m sure you do, you can do a much faster table lookup. If your table type supports it, you will only also lock that one row, and not the entire table.

    You argue that the reason for doing this in a single query is that you won’t lose “consistency” across queries as you have many visitors, but this is *exactly* what transactions are meant for. Basically you group queries into a “block” that won’t execute until you COMMIT. This is *much* better than using a CASE. I doubt it will work properly with other databases as well.

    I’ll give you some credit for thinking this up in the first place, but please don’t delude yourself into thinking this hasn’t been solved before 🙂

  5. Another thing: MySQL is “smart” enough not to replace a value with itself — sure, but this implies that it has to actually compare the first value to the second, which is a fairly expensive operation on a large scale.

  6. Why bother making a efficient Stored Procedure, when you can write the entire table…

    (terrible hack)

  7. “The CASE method seems to be efficient only for few updates”

    False! Please try to get this right. Your initial example was not ineffcient because you used CASE. It was inefficient because you DIDN’T use WHERE.

  8. You are correct. Although it doesn’t show any errors or warnings when you try to use transactions, MyISAM doesn’t use transactions.

    However, when you run a batch of UPDATE queries MySQL optimizes the result, so the fastest time was actually the time for normal queries.

    Also, after a few more tests I didn’t get any improvement by using LOCK TABLE before the batch. You might get better results if you have lots of READ queries at the same time though.

  9. Best of both worlds:

    UPDATE mytable SET title = CASE
    WHEN id = 1 THEN ‘Great Expectations’;
    WHEN id = 2 THEN ‘War and Peace’;
    ELSE title
    END
    WHERE id=1 OR id=2

  10. Actually if you have many items it would be more efficient to use something like:

    UPDATE mytable SET title = CASE
    WHEN id = 1 THEN ‘Great Expectations’;
    WHEN id = 2 THEN ‘War and Peace’;

    ELSE title
    END
    WHERE id IN (1,2, …)

  11. UPDATE mytable SET title = CASE
    WHEN id = 1 THEN ‘Great Expectations’;
    WHEN id = 2 THEN ‘War and Peace’;
    END
    WHERE id=1 OR id=2

    You do not need ‘ELSE’ at all

  12. Indeed, after optimizing the query with WHERE and getting rid of ELSE, I managed to get decent results with CASE as well.

    In particular for 20.000 updates on the same table I got these results:

    Time spent with 20000 queries in a row: 4 seconds

    Time spent with a single query using CASE: 25 seconds

    So yeah, it’s an alternative, just not a very good one in many cases.

  13. Hi,
    This is NKarthiKeswarN,
    Advance thanks to all…
    I want to know
    How to insert the datas in multiple table into a
    single query by using stored procedures…

  14. I am a beginneers level in dotnet and sql…
    Pls guide me…
    I want to know from the skratch about that…
    Advance thanks to all..
    I ready to dedicated myself…Guide me..

  15. Did you consider trying to insert each row and using “ON DUPLICATE UPDATE” to update the records instead? ie

    INSERT INTO test_table
    (sid, staffid, name)
    VALUES
    (1, ‘ar’, ‘Andrew 2’),
    (1, ‘jm’, ‘John 2’),
    (1, ‘sl’, ‘Seb 2’)
    ON DUPLICATE KEY UPDATE test_table.name = VALUES(name);

  16. for me it worked like a charm!
    I have 1.6mln rows table. Single update (check against 7 column values) happens in 18.2secs, while 11 in-case updates happens in 10 secs! awesome!

    UPDATE prices SET offer_value = CASE
    WHEN date=’2009-06-01′ and period=’7′ and description=’red’ and level=’Standart’ THEN 1000
    WHEN date=’2009-06-01′ and period=’7′ and description=’blue’ and level=’Standart’ THEN 1100
    WHEN date=’2009-06-01′ and period=’7′ and description=’brown’ and level=’Standart’ THEN 1200
    WHEN date=’2009-06-01′ and period=’7′ and description=’grey’ and level=’Standart’ THEN 1300
    WHEN date=’2009-06-01′ and period=’7′ and description=’greed’ and level=’Standart’ THEN 1400
    ELSE offer_value
    END
    WHERE indentif IN (3957230004)

  17. Could you try combining LOAD DATA INFILE with ON DUPLICATE KEY UPDATE. LOAD DATA INFILE is super fast so this might be the quickest way to update the database? Ex:

    LOAD DATA INFILE ‘myfile’ INTO TABLE ‘mytable’
    (unique_id, @var1, col2, @var2, col3, @var3 …)
    ON DUPLICATE KEY UPDATE (co2, co3, …) VALUES (col2, col3 …)

    #sorry my syntax might be off some.

  18. This is a good idea for smaller tables, and for projects that shouldn’t require a stored-procedure. Some projects I get the client won’t even give me DB access, just the site’s specific table and creds… which is limited to your basic Select, Update, Delete, Insert permissions.

    I’m using it to update the order preferences on a table of <2000 records and it does the job perfectly.

    Obviously if your table has more than a couple thousand records, a SPROC would be the best solution, but that also means it a big project and you most likely have all the resources you need. This was obviously built on the mindset of limited db ability.

    I personally like it because now I don't have to see 2000 queries be looped through to update said records, this query allows one good sweep and change. IMHO Good JOb!

  19. How about a temp table:

    CREATE TABLE tmp( id INT(8), title VARCHAR(32) );

    INSERT INTO tmp VALUES (1, ‘Great Expectations’),
    (2, ‘War and Peace’);

    UPDATE mytable,tmp SET mytable.title = tmp.title WHERE tmp.id = mytable.id;

    DROP TABLE tmp;

    The advantage of this is that it’s very easy to update multiple columns, e.g.

    SET mytable.col1 = tmp.col1, mytable.col2 = tmp.col2
    WHERE tmp.id = mytable.id

    you just have to make sure that the tmp table fields are the same (i.e. compatible) with the ‘mytable’ fields

  20. Oh to be a little more precise, i used the following syntax :
    UPDATE mytable SET title = CASE
    WHEN id = 1 THEN ‘Great Expectations’;
    WHEN id = 2 THEN ‘War and Peace’;

    END
    WHERE id IN (1,2, …)
    It was a pain in the ass to code through foreach and all but the result worthes it.

    Thx again ! And sorry for my english, im not a native…

  21. The Zune concentrates on being a Portable Media Player. Not really a internet browser. Not really a game machine. Maybe in the future it’ll do even better in those areas, but for now it is a easy way organize and listen to your music and videos, and is without peer and need rest. The iPod’s strengths are its web browsing and apps. If those sound more compelling, perhaps it is your best choice.

  22. The case in conjunction with the where is a brillant idea, but does not work, if you use compound attributes as keys. But, yeesssssss, I was desperately searching for the referencing method – VALUES(name) -of buggedcom’s statement, which does the trick:

    INSERT…
    ON DUPLICATE KEY UPDATE test_table.name = VALUES(name);

    Well done, Sir.

  23. I like doing this better instead of doing multiple updates in 1 query, for the sake of convenience and KISS principle:

    START TRANSACTION;
    UPDATE X1 SET ID=1 WHERE ID = Y1;
    UPDATE X2 SET ID=2 WHERE ID = Y2;
    UPDATE X3 SET ID=3 WHERE ID = Y3;
    COMMIT;

  24. @Gilbert, while that is a tried-and-true way of getting the same result… the truth is Transactions can only be done on tables with the InnoDB engine. Truth is most are running MyISAM which is incapable of running transactions, and InnoDB is yet to be the default engine… but if Oracle has their way it soon will be in next-gen version of MySQL.

    Again, the point of the query in the original article is to make 1 query to do multiple updates to save and bandwidth and sanity on a limited MySQL enabled web-host. It’s quite obvious that was it’s initial intentions.

  25. Pingback: web hosting europe
  26. Hello dear remove : from sql like this
    UPDATE mytable SET title = CASE
    WHEN id = 1 THEN ‘Great Expectations’
    WHEN id = 2 THEN ‘War and Peace’
    ELSE title
    END;

    I have tested this

Leave a Reply

Your email address will not be published. Required fields are marked *

Time limit is exhausted. Please reload CAPTCHA.