# Mysql - comparing and combining table rows



## markx (Dec 22, 2007)

Anyone in this corner of the internet au fait with mysql and PHP? You wouldn't need to know much to know more than I do but I'm sick of looking for the answer to this one...

I want to compare all the rows of a single mysql table, combine the totals of any duplicate rows into a single row (excluding the auto_increment column) and then delete the duplicates. Well, it's slightly more complicated than that but there's no point posting all the details if I'm talking to myself. :lol


----------



## MouseWithoutKeyboard (Oct 18, 2007)

It depends on what is the duplicated column is. Is it a username column? Are you comparing the WHOLE row and not a specific column? If you're comparing the whole row, you'd need to use CURSORS (you could do it set-based but it would be very costly in performance).

Infact, if you're comparing the WHOLE row, just do it server-side using PHP rather than at a DB level.

If it's just a column, you can do...(BACKUP your db before running the command)
DELETE table1 WHERE id IN (SELECT DISTINCT(username) FROM table1)

I think the query analyzer compiles the "DISTINCT" query into a "group by" query when it runs so you could use the group by clause as well (or a operator like =)


----------



## markx (Dec 22, 2007)

Lol, I must warn you that I only started messing around with Mysql last weekend. 

It's a table with an "id" (auto_increment, primary key), "product_name", "item_number", "price" (decimal) and "quantity". They're all NOT NULL except for "item_number". What I'm trying to work out is how to check for rows where the "product_name", "item_number" and "price" are the same and rows where "product_name" and "price" are the same. In other words, "widget, #1234, 10.00" should not be flagged as being a duplicate with "widget, 10.00". If a duplicate row is found I want to add the "quantity" values, update that value in the _other_ row and then delete the duplicate row.

Talk about biting off more than you can chew... :um


----------



## MouseWithoutKeyboard (Oct 18, 2007)

I hate mysql vs tsql

tempTable holds the IDs and QUANTITY values of the UNIQUE ROWS (they have the updated quantities)
tempTable3 holds the IDs that ARE UNIQUE
tempTable2 holds the ids that are NOT unique (duplicate rows)

According to SQL standards, you could use just one temp table for this BUT mysql doesn't have that feature built in so you have to use multiple temp tables (one to store uniques ids, one to store not unique ids, one to store quantity and unique ids)

You have to run this as a STORED PROCEDURE: 

CREATE TEMPORARY TABLE tempTable (id INT UNSIGNED NOT NULL DEFAULT 0, quantity INT UNSIGNED NOT NULL DEFAULT 0) ENGINE=MEMORY;
CREATE TEMPORARY TABLE tempTable3 (id INT UNSIGNED NOT NULL DEFAULT 0) ENGINE=MEMORY;
CREATE TEMPORARY TABLE tempTable2 (id INT UNSIGNED NOT NULL DEFAULT 0) ENGINE=MEMORY;

INSERT INTO tempTable (id, quantity) SELECT id, quantity FROM (SELECT id, SUM(quantity) as quantity from table1 as tbl2
GROUP BY tbl2.product_name, tbl2.item_number, tbl2.price, tbl2.quantity) as tblWithNewQuantity;

INSERT INTO tempTable3 (id) SELECT id FROM (SELECT id, SUM(quantity) as quantity from table1 as tbl2
GROUP BY tbl2.product_name, tbl2.item_number, tbl2.price, tbl2.quantity) as tblWithNewQuantity;

INSERT INTO tempTable2 (id) SELECT id FROM table1 WHERE id NOT IN (SELECT id FROM (SELECT id, SUM(quantity) as quantity from table1 as tbl2
GROUP BY tbl2.product_name, tbl2.item_number, tbl2.price, tbl2.quantity) as tblWithNewQuantity);

UPDATE table1 SET quantity = (SELECT quantity FROM tempTable WHERE tempTable.id = table1.id LIMIT 1) WHERE id IN (SELECT id FROM tempTable3);

DELETE FROM table1 WHERE id IN (SELECT id FROM tempTable2);

DROP TABLE tempTable;
DROP TABLE tempTable3;
DROP TABLE tempTable2;


Use the following query to test it out: (this will delete everything in the table and refresh it with test data)
DELETE FROM table1;
INSERT INTO table1 (product_name, item_number, price, quantity) VALUES ('product 1', NULL, 3, 1);
INSERT INTO table1 (product_name, item_number, price, quantity) VALUES ('product 1', NULL, 3, 1);
INSERT INTO table1 (product_name, item_number, price, quantity) VALUES ('product 1', NULL, 3, 1);
INSERT INTO table1 (product_name, item_number, price, quantity) VALUES('product 1', 'lol', 3, 1);


----------



## batman can (Apr 18, 2007)

Grrrrrrrr I hate Mysql.


----------



## markx (Dec 22, 2007)

Oh wow, I thought that it might be quite involved but I wasn't expecting it to be quite _that_ complex. Thanks a lot for that, I'll give a try and see what happens.


----------



## markx (Dec 22, 2007)

Ok, after several hours of searching and reading until my eyes crossed, I've finally given in! :lol How exactly do I set up and use a stored procedure?


----------



## MouseWithoutKeyboard (Oct 18, 2007)

markx said:


> Ok, after several hours of searching and reading until my eyes crossed, I've finally given in! :lol How exactly do I set up and use a stored procedure?


Use the query browser to create one. You could use the command prompt to create one BUT it is VERY tedious and unprofessional.

Basically, download the query browser from http://dev.mysql.com/downloads/gui-tools/5.0.html (It's called MySql GUI). Install it, run query browser, connect to your MySql server, right click the "database" you want and click "Create Stored Procedure" and paste the first section of the code between "BEGIN" and "END".

You should check out some of the tutorials available from MySql.


----------



## markx (Dec 22, 2007)

OK, after a lot of messing around I think I'm finally getting a little bit closer. Now I'm using a different server which has MySQL 5 (Godaddy :roll) and using MySQL Stored Procedure Editor as Godaddy don't allow Remote Access on shared servers. The procedure has been accepted (I didn't add anything in the "parameters" field - not sure if that's important) and now I'm just left scratching my head trying to work out how to call the procedure and/or use the query. :con


----------



## MouseWithoutKeyboard (Oct 18, 2007)

Use php to call it? I don't know if your hosting server has any "query tools" that you could run.
<?php
mysql_connect("localhost", "admin", "pass") or die(mysql_error()); <---- connects to the mysql db
mysql_select_db("test") or die(mysql_error()); <---- selects database
$result = mysql_query("CALL procedure_name") or die(mysql_error()); <---- executes the query
mysql_close() <---- closes the connection
?>

If you have some sort of query tool, you can just do "CALL procedure_name"

BTW, the procedure above does not require any parameters.


----------



## markx (Dec 22, 2007)

Thanks, I was trying to make it more complicated than that. Now, even after my latest "d'oh!" moment, I could still be on the wrong track but I'm wondering if the stored procedure is comparing the contents of the quantity column rather than the price column? What appears to be happening now is that rows are only being combined when their quantities are the same.







I tried to edit the stored procedure to test that theory but it wasn't a pretty sight...


----------



## MouseWithoutKeyboard (Oct 18, 2007)

markx said:


> Thanks, I was trying to make it more complicated than that. Now, even after my latest "d'oh!" moment, I could still be on the wrong track but I'm wondering if the stored procedure is comparing the contents of the quantity column rather than the price column? What appears to be happening now is that rows are only being combined when their quantities are the same.
> 
> 
> 
> ...


Oops, here's the fixed version: (Previously, it was comparing both the price and quantity columns)


> CREATE TEMPORARY TABLE tempTable (id INT UNSIGNED NOT NULL DEFAULT 0, quantity INT UNSIGNED NOT NULL DEFAULT 0) ENGINE=MEMORY;
> CREATE TEMPORARY TABLE tempTable3 (id INT UNSIGNED NOT NULL DEFAULT 0) ENGINE=MEMORY;
> CREATE TEMPORARY TABLE tempTable2 (id INT UNSIGNED NOT NULL DEFAULT 0) ENGINE=MEMORY;
> 
> ...


Notice that I took out the "quantity" column in the group by clause.


----------



## markx (Dec 22, 2007)

That's it, perfect! Thanks a lot for that, Charles, it would have taken me 10 years to work out how to do that all by myself.


----------

