sql日常笔记

使用sql复制表数据 (insert into xx(xx, xx, xx) select)

使用insert into select可以直接复制表数据，同时进行可以在其中进行其他操作，比如本次使用的示例

insert into result(task, acquisition_time, area, color, mapExtent, imgPath, cloud, land_id, c_time) select task, DATE_ADD(acquisition_time, INTERVAL 45 DAY), area, color, mapExtent, imgPath, cloud, land_id, c_time from result where year(acquisition_time)=2022;

其中使用了DATE_ADD方法，用于更改数据副本的时间，使用year筛选某一年的所有数据

使用sql生成随机数 (rand) (floor)

生成随机小数，使用rand()*10 会生成0-10之间的随机小数，如果想要整数，可以使用floor方法

select floor(rand()*100)

sql更新全表

这里使用随机数更新全表的一个字段

update table_name set column = floor(rand()*100)
或者
update table_name set column = select (floor(rand()*100))

字段值替换 (replace)

将字段的 http://替换为https://

update table_name set column = raplace(column, "http://", "https://")

##更新多个值 (update)
原sql

update result set cloud= floor(rand()*100) where cdate in (select cdate FROM result GROUP BY cdate)

如果直接运行会报错1093 - You can't specify target table 'result' for update in FROM clause
解决办法，增加一个中间表，如下,即可解决

update result set cloud= floor(rand()*100) where cdate in (select t from (select cdate as t FROM result GROUP BY cdate) as tt)

上面的语句并没有真正解决问题，因为有多个时间，每个事件有多条记录，但是现在希望相同的日期值也相同，所以使用下面的方法

##替换批量更新（replace into ）
原文链接
使用方法replace into table (id, column2) values (1, v2), (1, v22) 也可以是replace into table(id, v1) selcet 1, v1
需要注意的是 replace into 跟 insert 功能类似，不同点在于：replace into 首先尝试插入数据到表中， 1. 如果发现表中已经有此行数据（根据主键或者唯一索引判断）则先删除此行数据，然后插入新的数据。 2. 否则，直接插入新数据。
所以这里不太适合直接更新某个字段

##更新查询出的数据（update xx innor join select xx）
update table innor join (select column from table) as temp_table on table.column=temp_table.column set column1=value1,基础使用是这样实际情况还需要更多的随即及应变
比如相同的日期给相同的随机值,如下sql

update result INNER JOIN (SELECT adate, FLOOR(rand()*100) as num from result GROUP BY adate) as t on szy_analyse_result.adate=t.adate SET cloud = t.num

就完成了对相同日期赋予相同的随机值

##复杂sql的编写
很多的BI页面需要对数据进行统计，一般都会写很多复杂的sql查询，在写这类查询的时候可以将复杂问题简化为简单问题，可以试着理解这篇文章

mysql日期差函数(datediff(date1, date2))

在不同的sql中datediff使用方法不尽相同，在mysql中这个方法返回的是相差的天数，postgre中没有这个方法

字符串截取（substring）

当展示列表时，不需要展示全部数据，又要一部分某个字段的数据，可以使用这个函数substring(column, 0, 20)从头截取20位，参数个数可一个也可三个，类似python切片，但是在实际的使用中感觉效果不是很好，其他方法例如，使用空间换时间，在新增或求改数据的时候就将markdown数据格式化为一个截取过的数据，单独保存在一个字段中，查询的时候直接去这个字段

mysql对重复数据进行去重

有些时候就是有那么一些需求，要对某些字段重复的数据进行去重，当然直接写python脚本可以实现，但是更快的方法还是直接写sql
先使用简单的sql查看一下数据的重复情况

select title, max(id) as m, count(id) as c from items GROUP BY title

结果

title	m	c
[程序员] 一款帮助发现英文单词的 Chrome 扩展	1958	3
[酷工作] 恭喜🎉 被 ...，会 react 加分❗️]	2	3
[问与答] 这个是什么字体	3	1

在简化一下上面的sql直接得到全部去重后的id

select max(id) from items GROUP BY title

这样就得到了要保留数据的id，剩下的可以用 not in 条件进行删除，

delete from items where id not in (select max(id) from items GROUP BY title)

但是这个语句执行后会报错1093 - You can't specify target table 'items' for update in FROM clause,不能在子句中更新目标表，这个问题也很好解决，生成一个中间临时表然后再进行更新或删除操作（删除也可以看作是更新的一种），语句就变成

delete from items where id not in (select temp_table.i_d from (select max(id) as i_d from items GROUP BY title) as temp_table)

中间多了一次select，也就是temp_table临时表。成功删除重复的数据。

时间查询

时间相关函数

# 当前日期时间
select now();  # 2023-09-15 03:38:08
# 当前日期
select curdate(); # 2023-09-15
# 当前日期，这个获取的是UTC的
select curtime(); # 03:39:38

# 获取当前的单个年或月等
select extract(year from now()); # 2023 其中year还可以是 month,day,hour,minute,second
# 或者从时间字符串直接获取
select extract(second from "2022-02-02 02:02:02") # 2

时间增加、减少

# 时间减少1小时
select date_sub(now(), interval 1 hour)
# 时间增加1天
select date_add(now(), interval 1 day)

日期的格式化

日期时间转字符串

select date_format(now(), "%Y-%m-%d %H:%i:%s") # 2023-09-15 03:53:47
select date_format(now(), "%Y-%m-%d %H:00:00") # 2023-09-15 03:00:00

字符串转日期时间

select str_to_date('2021-04-02 10:37:14', '%Y-%m-%d %H:%i:%s');