2013/07/23

SQL stuff

I am just a beginner in SQL, so I write some trivial stuff here. Lets say I have some datum in my table of logs:
SELECT *, stime - time AS delay FROM logs;
+------+------+-------+-------+
| usid | time | stime | delay |
+------+------+-------+-------+
|    1 |   10 |    10 |     0 |
|    1 |   10 |    12 |     2 |
|    1 |   15 |    17 |     2 |
|    1 |   13 |    15 |     2 |
|    1 |   18 |    21 |     3 |
|    1 |   19 |    22 |     3 |
|    1 |   20 |    25 |     5 |
|    1 |   21 |    26 |     5 |
|    1 |   22 |    26 |     4 |
|    1 |   23 |    26 |     3 |
+------+------+-------+-------+
Where:
  • usid is some random id
  • time is record time in the client side
  • stime is record time in the server side
  • delay is simply the diff of server time and client side


Some exercises:

Count the number of records with a particular delay:
SELECT stime - time AS delay,
       COUNT(*) FROM logs
GROUP BY stime - time;
+-------+----------+
| delay | count(*) |
+-------+----------+
|     0 |        1 |
|     2 |        3 |
|     3 |        3 |
|     4 |        1 |
|     5 |        2 |
+-------+----------+
Too much detail (too granulated), so we want to group 2 and 3 delay into 2, 4 and 5 into 4, etc. To make sure I am doing the right calculations, here is the delay and rounded delay:
SELECT *,
       stime - time AS delay,
       FLOOR((stime - time) / 2) * 2 AS rounded_delay
FROM   logs;
+------+------+-------+-------+---------------+
| stid | time | ctime | delay | rounded_delay |
+------+------+-------+-------+---------------+
|    1 |   10 |    10 |     0 |             0 |
|    1 |   10 |    12 |     2 |             2 |
|    1 |   15 |    17 |     2 |             2 |
|    1 |   13 |    15 |     2 |             2 |
|    1 |   18 |    21 |     3 |             2 |
|    1 |   19 |    22 |     3 |             2 |
|    1 |   20 |    25 |     5 |             4 |
|    1 |   21 |    26 |     5 |             4 |
|    1 |   22 |    26 |     4 |             4 |
|    1 |   23 |    26 |     3 |             2 |
+------+------+-------+-------+---------------+
So now group by the rounded delay (named just "delay" here)
SELECT FLOOR((stime - time) / 2) * 2 AS delay,
       COUNT(*) FROM logs
GROUP BY FLOOR((stime - time) / 2) * 2;
+-------+----------+
| delay | count(*) |
+-------+----------+
|     0 |        1 |
|     2 |        6 |
|     4 |        3 |
+-------+----------+
Add a percentage, I know I have 10 rows in my table so I can do this:
SELECT *,
       c.cnt / 10 AS percentage
FROM (
      SELECT FLOOR((stime - time) / 2) * 2 AS delay,
             COUNT(*) AS cnt
      FROM   logs
      GROUP BY FLOOR((stime - time) / 2) * 2
) c;
+-------+-----+------------+
| delay | cnt | percentage |
+-------+-----+------------+
|     0 |   1 |     0.1000 |
|     2 |   6 |     0.6000 |
|     4 |   3 |     0.3000 |
+-------+-----+------------+
Although, probably is better not to hard-code the count:
SELECT *,
       c.cnt / ( SELECT COUNT(*) FROM logs) AS percentage
FROM (
       SELECT FLOOR((stime - time) / 2) * 2 AS delay,
              COUNT(*) AS cnt
       FROM   logs
       GROUP BY FLOOR((stime - time) / 2) * 2
) c;
+-------+-----+------------+
| delay | cnt | percentage |
+-------+-----+------------+
|     0 |   1 |     0.1000 |
|     2 |   6 |     0.6000 |
|     4 |   3 |     0.3000 |
+-------+-----+------------+
There should be much better ways (better performance) of doing this... I am glad to receive some comments :)

2 comments :

Ignacio Enriquez said...

すごぉ〜い!とってもわかりやすいですね!
貴方は天才です!

Ignacio Enriquez said...

yeah!