这个确实不太好做 之后多看看
题目
The Numbers table keeps the value of number and its frequency.
+----------+-------------+
| Number | Frequency |
+----------+-------------|
| 0 | 7 |
| 1 | 1 |
| 2 | 3 |
| 3 | 1 |
+----------+-------------+
In this table, the numbers are 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 2, 3, so the median is (0 + 0) / 2 = 0.
Write a query to find the median of all numbers and name the result as median.
审题
之前做过一个分组中位数的题
这个是给定一个序列各元素的频率再找出中位数
生成数据
DROP TABLE Numbers;
CREATE TABLE Numbers(
Number INT,
Frequency INT);
INSERT INTO Numbers VALUE(0,7),(1,1),(2,3),(3,1);
解答
给出累计频率
SELECT N.`Number`, N.Frequency ,@Interval:=@Interval + N.Frequency AS Inte
FROM Numbers AS N, (SELECT @Interval:=0) AS init
ORDER BY N.Number;
给出每个数的排名的区间 太复杂了 不考虑这个了
SELECT tmp1.Number, tmp1.Linte, tmp2.RInte
FROM (SELECT N.`Number`, @Interval:=@Interval + N.Frequency AS LInte, @num:=@num+1 AS num
FROM Numbers AS N, (SELECT @Interval:=0, @num:=0) AS init
ORDER BY N.Number) tmp1
LEFT JOIN (SELECT N.`Number`, @Interval1:=@Interval1 + N.Frequency AS RInte, @num1:=@num1+1 AS num1
FROM Numbers AS N, (SELECT @Interval1:=0, @num1:=0) AS init
ORDER BY N.Number) tmp2
ON tmp1.num +1 = tmp2.num1;
又涉及到奇偶性的问题 统一成一个问题选取 如果累计频率在[总频率数/2, 总频率数/2 + 当前频率数]即可
AccFreq范围在[SumFreq / 2, SumFreq / 2 + Frequency]的Number均值即为答案。
先把总频率加入表中
SELECT *
FROM (SELECT N.`Number`, N.Frequency ,@Interval:=@Interval + N.Frequency AS Inte
FROM Numbers AS N, (SELECT @Interval:=0) AS init
ORDER BY N.Number) AS t1, (SELECT SUM(N.`Frequency`) AS all_cnt
FROM Numbers AS N) AS t2
按如上方法选择
SELECT *, t2.all_cnt/2 AS Left_int, t2.all_cnt/2 + t1.Frequency AS Right_int
FROM (SELECT N.`Number`, N.Frequency ,@Interval:=@Interval + N.Frequency AS AccFreq
FROM Numbers AS N, (SELECT @Interval:=0) AS init
ORDER BY N.Number) AS t1, (SELECT SUM(N.`Frequency`) AS all_cnt
FROM Numbers AS N) AS t2
WHERE t1.AccFreq BETWEEN t2.all_cnt/2 AND t2.all_cnt/2 + t1.Frequency;
偶数情况可能出现两个不同的数在区间边界 所以需要取均值
SELECT AVG(t1.Number) AS 'median'
FROM (SELECT N.`Number`, N.Frequency ,@Interval:=@Interval + N.Frequency AS AccFreq
FROM Numbers AS N, (SELECT @Interval:=0) AS init
ORDER BY N.Number) AS t1, (SELECT SUM(N.`Frequency`) AS all_cnt
FROM Numbers AS N) AS t2
WHERE t1.AccFreq BETWEEN t2.all_cnt/2 AND t2.all_cnt/2 + t1.Frequency;
网友评论