博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
DELETE_DROP
阅读量:4210 次
发布时间:2019-05-26

本文共 21263 字,大约阅读时间需要 70 分钟。


思维导图

这里写图片描述


概述

delete from t where create_time < to_date('2017-01-01','yyyy-mm-dd');
  • 1

当我们看到这条语句时,会想到什么呢? 一条再简单不过的按照条件删除数据库的操作。 如果大量存在,会不会引起系统性能问题呢?

当这个表的数据足够大时,按照这样的方式来清除数据,代价无疑是非常高昂的。

我们提出使用分区的方式来解决这个问题。比如 保留历史数据的骨折是,删除最早一个月的数据,按照这个需求,将需要删除数据的表,改为分区表。 按照每个月一个分区的方式创建分区表。

这样做的结果是,以前的delete操作就变成了下面的操作:

alter table t drop partition p_name或者 alter table t truncate partiton p_name ;

这个操作对已一个几GB数据的分区表来讲,执行时间不超过10S。

比起使用分区来操作数据,DELETE方式操作数据的问题是显而易见的,至少存在如下两点:

  • 消耗大量的系统资源
  • 无法释放空间

DELETE与系统资源

delete 操作消耗的资源

在DELETE操作中,SQL语句首先要扫描表或者索引找到符合条件的记录,然后把他们删除,这个过程中将消耗大量的CPU资源和产生大量的I/O,同时还会产生大量的undo数据。

下面是一个简单的栗子,我们可以看出delete相对于drop或者truncate对资源的消耗情况:

SQL> drop table t purge ;  --->  不放入回收站,直接删除Table droppedSQL> create table t as select a.OBJECT_ID,a.OBJECT_NAME from dba_objects a;Table createdSQL> exec dbms_stats.gather_table_stats(user,'t');---->表分析PL/SQL procedure successfully completedSQL>

注释:执行drop table xx 语句 

drop后的表被放在回收站(user_recyclebin)里,而不是直接删除掉。这样,回收站里的表信息就可以被恢复,或彻底清除。 
通过查询回收站user_recyclebin获取被删除的表信息,然后使用语句 
flashback table <user_recyclebin.object_name or user_recyclebin.original_name> to before drop [rename to 
]; 
将回收站里的表恢复为原名称或指定新名称,表中数据不会丢失。 
若要彻底删除表,则使用语句:drop table <table_name> purge; 
清除回收站里的信息 
清除指定表:purge table <table_name>
清除当前用户的回收站:purge recyclebin; 
SQL> purge recyclebin; 
Done 
清除所有用户的回收站:purge dba_recyclebin; 
不放入回收站,直接删除则是:drop table xx purge;


oracle@entel1:[/oracle]$sqlplus / as sysdbaSQL*Plus: Release 11.2.0.4.0 Production on Tue Jan 17 22:17:53 2017Copyright (c) 1982, 2013, Oracle.  All rights reserved.Connected to:Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit ProductionWith the Partitioning, OLAP, Data Mining and Real Application Testing optionsSQL> conn xgj/smartConnected.SQL> SQL> set autotrace trace exp stat; -->显示执行计划和统计结果SQL>  delete from t where t.object_id < 10000;9811 rows deleted.Execution Plan----------------------------------------------------------Plan hash value: 3335594643---------------------------------------------------------------------------| Id  | Operation      | Name | Rows  | Bytes | Cost (%CPU)| Time     |---------------------------------------------------------------------------|   0 | DELETE STATEMENT   |      |  4420 | 22100 |    43   (0)| 00:00:01 ||   1 |  DELETE        | T    |   |   |        |      ||*  2 |   TABLE ACCESS FULL| T    |  4420 | 22100 |    43   (0)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("T"."OBJECT_ID"<10000)Statistics----------------------------------------------------------     18  recursive calls      10465  db block gets    162  consistent gets      0  physical reads    2640572  redo size    831  bytes sent via SQL*Net to client    797  bytes received via SQL*Net from client      3  SQL*Net roundtrips to/from client      2  sorts (memory)      0  sorts (disk)       9811  rows processedSQL>

SET AUTOT 用法

用法: SET AUTOT[RACE] {OFF | ON | TRACE[ONLY]} [EXP[LAIN]] 

[STAT[ISTICS]]

SET AUTOT[RACE] OFF 停止AutoTrace SET AUTOT[RACE] ON 

开启AutoTrace,显示AUTOTRACE信息和SQL执行结果 SET AUTOT[RACE] TRACEONLY 
开启AutoTrace,仅显示AUTOTRACE信息 SET AUTOT[RACE] ON EXPLAIN 
开启AutoTrace,仅显示AUTOTRACE的EXPLAIN信息 SET AUTOT[RACE] ON 
STATISTICS开启AutoTrace,仅显示AUTOTRACE的STATISTICS信息

在这种情况下,DELETE操作产生了 10465 db block gets + 162 consistent gets 数据块读,同时产生了 2640572 redo size 的 redo日志 (2640572 /1024/1024 = 2.5M 大小)

即使我们给这个表创建了索引,依然无法避免这种资源消耗

SQL> rollback;Rollback complete.SQL> create index ind_t on t(object_id);Index created.SQL> exec dbms_stats.gather_index_stats(user,'ind_t');PL/SQL procedure successfully completed.SQL> delete from t where t.object_id < 10000;9811 rows deleted.Execution Plan----------------------------------------------------------Plan hash value: 3974964266---------------------------------------------------------------------------| Id  | Operation     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |---------------------------------------------------------------------------|   0 | DELETE STATEMENT  |   |  4420 | 22100 |    11   (0)| 00:00:01 ||   1 |  DELETE       | T     |   |   |        |      ||*  2 |   INDEX RANGE SCAN| IND_T |  4420 | 22100 |    11   (0)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - access("T"."OBJECT_ID"<10000)Statistics----------------------------------------------------------     15  recursive calls      10596  db block gets     36  consistent gets      0  physical reads    2801592  redo size    837  bytes sent via SQL*Net to client    796  bytes received via SQL*Net from client      3  SQL*Net roundtrips to/from client      2  sorts (memory)      0  sorts (disk)       9811  rows processedSQL> rollback;Rollback complete.

可以看到即使加了索引,资源的消耗情况差不多,都是 10596 db block gets + 36 consistent gets 。


truncate 和 drop 操作消耗的资源

相反的 truncate 和 drop 操作消耗的资源就小很多。

案例

Connected to Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 Connected as xxx@xgjSQL> --->创建分区表SQL> create table t_part (object_id int ,object_name varchar2(2000) )  2         partition by range(object_id)  3              (  4               partition p1 values less than (6000),  5               partition p2 values less than (12000),  6               partition p3  values less than (18000),  7               partition p4  values less than (24000),  8               partition p5 values less than (30000),  9               partition pm values less than (maxvalue) 10              );Table createdSQL> insert into t_part select object_id ,object_name from dba_objects;35238 rows insertedSQL> commit;Commit completeSQL> select count(1)  from t_part  partition(p1);  COUNT(1)----------      5995SQL>

下面我们使用sql_trace来跟踪下 drop和truncate 之类的DDL操作的相关信息。

SQL> alter session set tracefile_identifier='xgj_20170118';Session alteredSQL> alter session set sql_trace=true;Session alteredSQL> alter table t_part truncate partition p1;Table truncatedSQL> truncate table t_part;Table truncatedSQL>  insert into t_part select object_id ,object_name from dba_objects;35238 rows insertedSQL> commit;Commit completeSQL> alter table t_part drop partition p1;Table alteredSQL> alter session set sql_trace=false;Session alteredSQL>

开启SQL跟踪后,会生成一个trace文件,通过初始化参数user_dump_dest配置其所在目录,该参数的值可以通过下面方法获取到:

SQL> select name, value from v$parameter where name = 'user_dump_dest' ;NAME                                   VALUE--------------------------- -----------------------------user_dump_dest               /oracle/diag/rdbms/cc/cc/trace
  • 这里写图片描述

以上是产生的trace原文件,我们通过tkprof来查看一下:

oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$tkprof cc_ora_14104_xgj_20170118.trc   cc_ora_14104_xgj_20170118.log explain=xxx/xgj sys=no ;TKPROF: Release 11.2.0.4.0 - Development on Wed Jan 18 16:35:18 2017Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$

以下是截取cc_ora_14104_xgj_20170118.log中的关键信息:

truncate partition p1

********************************************************************************SQL ID: 4gj9z8z8707sx Plan Hash: 491940592alter table t_part truncate partition p1call     count       cpu    elapsed       disk      query    current        rows------- ------  -------- ---------- ---------- ---------- ----------  ----------Parse        1      0.00       0.00          0          5          0           0Execute      1      0.01       0.38         20          1         69           0Fetch        0      0.00       0.00          0          0          0           0------- ------  -------- ---------- ---------- ---------- ----------  ----------total        2      0.01       0.38         20          6         69           0Misses in library cache during parse: 1Optimizer mode: ALL_ROWSParsing user id: 38  (xxx)********************************************************************************

truncate table

********************************************************************************SQL ID: cv84y6hnt1dy2 Plan Hash: 491940592truncate table t_partcall     count       cpu    elapsed       disk      query    current        rows------- ------  -------- ---------- ---------- ---------- ----------  ----------Parse        1      0.00       0.00          0          0          0           0Execute      1      0.02       1.54         80          4        266           0Fetch        0      0.00       0.00          0          0          0           0------- ------  -------- ---------- ---------- ---------- ----------  ----------total        2      0.02       1.54         80          4        266           0Misses in library cache during parse: 1Optimizer mode: ALL_ROWSParsing user id: 38  (xxx)********************************************************************************

drop partition p1

********************************************************************************SQL ID: 7wwjnfwgcxgza Plan Hash: 0alter table t_part drop call     count       cpu    elapsed       disk      query    current        rows------- ------  -------- ---------- ---------- ---------- ----------  ----------Parse        1      0.00       0.00          0          0          0           0Execute      1      0.00       0.69          1          1         10           0Fetch        0      0.00       0.00          0          0          0           0------- ------  -------- ---------- ---------- ---------- ----------  ----------total        2      0.00       0.69          1          1         10           0Misses in library cache during parse: 1Optimizer mode: ALL_ROWSParsing user id: 38  (xxx)

从结果可以看到,DDL对分区操作消耗的资源要远小于DML的

操作 数据块的读取次数
delete 1000+
truncate 6
drop 1

可以看出DDL的操作要远小于DML消耗的资源。 DDL的操作主要是对数据字典的修改,这个值基本上是确定的,而DML会随着数据量的增加,消耗的资源也会相应的增加


DML操作会随着数据量的增加,消耗的资源也会相应的增加

oracle@entel1:[/oracle/diag/rdbms/cc/cc/trace]$sqlplus /nologSQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 18 18:04:57 2017Copyright (c) 1982, 2013, Oracle.  All rights reserved.SQL> conn xxx/xgjConnected.SQL> select count(1) from t;  COUNT(1)----------     35249SQL> set autotrace trace exp stat ; SQL> delete from t where object_id < 10000;9811 rows deleted.Execution Plan----------------------------------------------------------Plan hash value: 3974964266---------------------------------------------------------------------------| Id  | Operation     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |---------------------------------------------------------------------------|   0 | DELETE STATEMENT  |   |  4420 | 22100 |    11   (0)| 00:00:01 ||   1 |  DELETE       | T     |   |   |        |      ||*  2 |   INDEX RANGE SCAN| IND_T |  4420 | 22100 |    11   (0)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - access("OBJECT_ID"<10000)Statistics----------------------------------------------------------     45  recursive calls      10940  db block gets     52  consistent gets      0  physical reads    2806724  redo size    831  bytes sent via SQL*Net to client    794  bytes received via SQL*Net from client      3  SQL*Net roundtrips to/from client      3  sorts (memory)      0  sorts (disk)       9811  rows processedSQL> rollback;Rollback complete.SQL> delete from t where object_id < 20000;13938 rows deleted.Execution Plan----------------------------------------------------------Plan hash value: 3974964266---------------------------------------------------------------------------| Id  | Operation     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |---------------------------------------------------------------------------|   0 | DELETE STATEMENT  |   |  8842 | 44210 |    21   (0)| 00:00:01 ||   1 |  DELETE       | T     |   |   |        |      ||*  2 |   INDEX RANGE SCAN| IND_T |  8842 | 44210 |    21   (0)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - access("OBJECT_ID"<20000)Statistics----------------------------------------------------------     17  recursive calls      15016  db block gets     47  consistent gets      0  physical reads    3988132  redo size    836  bytes sent via SQL*Net to client    794  bytes received via SQL*Net from client      3  SQL*Net roundtrips to/from client      2  sorts (memory)      0  sorts (disk)      13938  rows processedSQL> rollback;Rollback complete.SQL> delete from t where object_id <30000;14403 rows deleted.Execution Plan----------------------------------------------------------Plan hash value: 3974964266---------------------------------------------------------------------------| Id  | Operation     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |---------------------------------------------------------------------------|   0 | DELETE STATEMENT  |   | 13263 | 66315 |    31   (0)| 00:00:01 ||   1 |  DELETE       | T     |   |   |        |      ||*  2 |   INDEX RANGE SCAN| IND_T | 13263 | 66315 |    31   (0)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - access("OBJECT_ID"<30000)Statistics----------------------------------------------------------     17  recursive calls      15508  db block gets     48  consistent gets      0  physical reads    4121320  redo size    837  bytes sent via SQL*Net to client    793  bytes received via SQL*Net from client      3  SQL*Net roundtrips to/from client      2  sorts (memory)      0  sorts (disk)      14403  rows processedSQL>

从结果可以看到,消耗的资源和删除的数据量是同步增长的

删除的数据量 数据块的读取次数
object_id<10000 10940 + 52
object_id<20000 15016 +47
object_id<30000 15508+48

truncate的消耗

t_part 表 

这里写图片描述

t_part_2表

SQL> create table t_part_2 (object_id int ,object_name varchar2(2000) )  2         partition by range(object_id)  3              (  4               partition p1 values less than (20000),  5               partition pm values less than (maxvalue)  6              );Table createdSQL> insert into t_part_2 select object_id ,object_name from dba_objects;35239 rows insertedSQL> commit;Commit completeSQL> alter session set tracefile_identifier='xgj_truncate';Session alteredSQL> alter session set sql_trace=true;Session alteredSQL> alter table t_part_2 truncate partition p1;Table truncatedSQL> alter session set sql_trace=false;Session alteredSQL>

这里写图片描述

我们可以看到,truncate操作时,数据块读的次数10次左右,所以DDL操作并没有随着操作的数据增加而明显的增加,它只取决于需要操作的字典表的数据量。


DELETE与释放空间

DELETE操作并不能释放出空间,也就是说,删除了哪个表的数据,疼出来的空间还是只能是那个表使用,并不能让给其他的对象使用,因为delete操作并不能使高水位线下降。 如果需要通过清除一个表中的部分数据来腾出空间给其他的对象,这个操作时不行的。

而truncate或者drop操作,就可以释放出空间给其他的对象使用。

下面举例说明:

我们还是用我们上面创建的t_part分区表为例

SQL> select * from user_extents  a where a.segment_name='T_PART'; SEGMENT_NAME                                                  PARTITION_NAME SEGMENT_TYPE TABLESPACE_NAME  EXTENT_ID      BYTES     BLOCKS-------------------------------------------------------------------------------- ------------------------------ ------------------ ------------------------------ ---------- ---------- ----------T_PART                                                                   P1                           TABLE PARTITION TAB_ZMC                       0    8388608       1024T_PART                                                                   P2                           TABLE PARTITION TAB_ZMC                       0    8388608       1024T_PART                                                                   P3                           TABLE PARTITION TAB_ZMC                       0    8388608       1024T_PART                                                                   PM                         TABLE PARTITION TAB_ZMC                       0    8388608       1024T_PART                                                                   P5                           TABLE PARTITION TAB_ZMC                       0    8388608       1024SQL> select count(1)  from user_extents  a where a.segment_name='T_PART';   COUNT(1)----------         5SQL>

最初的时候,T_PART表的数据占据了5个EXTENTS(区)

我们通过delete删除部分数据

SQL> select count(1)  from t_part;  COUNT(1)----------     35245SQL> delete from t_part a where a.object_id<30000;14403 rows deletedSQL> commit;Commit completeSQL> select count(1)  from user_extents a where a.segment_name='T_PART';  COUNT(1)----------         5SQL>

我们删除了14403 条数据后,发现T_PART仍然占据着5个EXTENDS,也就是说虽然表中的数据删除了,但是空间只能是T_PART自己使用,并没有将控件返回给数据库。

我们使用truncate 来试下 ,用T表的数据吧,比较多

SQL> select count(1)  from  user_extents a where a.segment_name='T';  COUNT(1)----------        17SQL> select count(1)  from t;  COUNT(1)----------     35245SQL> truncate table t;Table truncatedSQL>  select count(1)  from  user_extents a where a.segment_name='T';  COUNT(1)----------         1SQL> select count(1)  from   2  t;  COUNT(1)----------         0SQL>

可以看出 通过truncate 之后 T表占用的数据区从17个降为1个,大部分空间都返回给了数据库,这部分空间是可以分配给其他数据使用的。

还有一种方式,就是如果这个表上有许多碎片,可以通过整理表的空间来达到释放空间的目的,命令如下

alter table  t  move ;
  • 1

我们用T表来做下试验,还原T表的数据

SQL> insert into t select object_id ,object_name from dba_objects;35245 rows insertedSQL> commit;Commit completeSQL> select count(1)  from user_extents a where a.segment_name='T';  COUNT(1)----------        17SQL> delete from t ; --->通过delete方式删除数据35245 rows deletedSQL> commit;Commit completeSQL> select count(1)  from user_extents a where a.segment_name='T';  COUNT(1)----------        17SQL> alter table t move;--->重新整理数据Table alteredSQL>  select count(1)  from user_extents a where a.segment_name='T';  COUNT(1)----------         1SQL>

我们看到,使用alter table move 操作,表的空间基本上已经全部释放出来了,只剩下一个EXTENT(因为当前已经是一个空表了)。

通过alter table move 方式让表中的数据重新存储,这样可以释放出来一些空间,但是没这个操作基本上等同于将表中的数据重新分布一次,如果表中的数据比较庞大,这将是一个非常耗时的操作。


实际上delete的操作只适用删除非常少量的数据,并且需要在索引存在的情况下适用。

案例说明:

SQL> select count(1)  from t;  COUNT(1)----------         0SQL> insert into t select object_id ,object_name from dba_objects;35245 rows insertedSQL> commit;Commit completeSQL> select  a.index_name ,a.index_type ,a.table_name  from user_indexes a where a.table_name = 'T'; --->查看表上的索引INDEX_NAME       INDEX_TYPE      TABLE_NAME------------------------------ --------------------------- ------------------------------IND_T                    NORMAL           TSQL> drop index ind_t ;-->先删掉索引Index droppedSQL>  select  a.index_name ,a.index_type ,a.table_name  from user_indexes a where a.table_name = 'T';INDEX_NAME       INDEX_TYPE      TABLE_NAME------------------------------ --------------------------- ------------------------------SQL>

看看执行计划吧

oracle@entel1:[/oracle]$sqlplus /nologSQL*Plus: Release 11.2.0.4.0 Production on Wed Jan 18 20:40:16 2017Copyright (c) 1982, 2013, Oracle.  All rights reserved.SQL> conn xxx/xgjConnected.SQL> set autotrace trace exp stat;SQL> delete from t where object_id=888;1 row deleted.Execution Plan----------------------------------------------------------Plan hash value: 3335594643---------------------------------------------------------------------------| Id  | Operation      | Name | Rows  | Bytes | Cost (%CPU)| Time     |---------------------------------------------------------------------------|   0 | DELETE STATEMENT   |      | 1 | 5 |    43   (0)| 00:00:01 ||   1 |  DELETE        | T    |   |   |        |      ||*  2 |   TABLE ACCESS FULL| T    | 1 | 5 |    43   (0)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - filter("OBJECT_ID"=888)Statistics----------------------------------------------------------      4  recursive calls      4  db block gets    209  consistent gets      0  physical reads      0  redo size    830  bytes sent via SQL*Net to client    790  bytes received via SQL*Net from client      3  SQL*Net roundtrips to/from client      2  sorts (memory)      0  sorts (disk)      1  rows processedSQL>

在没有索引的情况下,即使删除很少的数据,Oracle也会将全表扫描一遍(分区表除外)。

此时,Oracle读取了209 个数据块,如果我们肩上索引,效果会好很多。

SQL> create index ind_t on t(object_id);Index created.SQL> delete from t where object_id=888;1 row deleted.Execution Plan----------------------------------------------------------Plan hash value: 3974964266---------------------------------------------------------------------------| Id  | Operation     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |---------------------------------------------------------------------------|   0 | DELETE STATEMENT  |   | 1 | 5 | 1   (0)| 00:00:01 ||   1 |  DELETE       | T     |   |   |        |      ||*  2 |   INDEX RANGE SCAN| IND_T | 1 | 5 | 1   (0)| 00:00:01 |---------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------   2 - access("OBJECT_ID"=888)Statistics----------------------------------------------------------      1  recursive calls      5  db block gets      2  consistent gets      4  physical reads      0  redo size    834  bytes sent via SQL*Net to client    790  bytes received via SQL*Net from client      3  SQL*Net roundtrips to/from client      2  sorts (memory)      0  sorts (disk)      1  rows processedSQL>
你可能感兴趣的文章
解析漏洞总结
查看>>
【Windows C++】调用powershell上传指定目录下所有文件
查看>>
kotlin-android-extensions 插件无效问题
查看>>
经典排序算法--Java实现
查看>>
Java中JRadioButton单选按钮分组方法
查看>>
Java图形界面中单选按钮JRadioButton和按钮Button事件处理
查看>>
小练习 - 排序:冒泡、选择、快排
查看>>
SparkStreaming 如何保证消费Kafka的数据不丢失不重复
查看>>
Spark Shuffle及其调优
查看>>
数据仓库分层
查看>>
常见数据结构-TrieTree/线段树/TreeSet
查看>>
Hive数据倾斜
查看>>
TopK问题
查看>>
Hive调优
查看>>
HQL排查数据倾斜
查看>>
DAG以及任务调度
查看>>
LeetCode——DFS
查看>>
MapReduce Task数目划分
查看>>
ZooKeeper分布式锁
查看>>
3126 Prime Path
查看>>