I have a table likes the below.
id session_id start_time answer_time
1 111 2022-12-06 13:40:50 2022-12-06 13:40:55
2 111 2022-12-06 13:40:51 Null
3 111 2022-12-06 13:40:57 Null
4 222 2022-12-06 13:40:58 Null
5 222 2022-12-06 13:41:10 Null
6 222 2022-12-06 13:41:10 Null
7 333 2022-12-06 13:46:10 2022-12-06 13:46:15
8 333 2022-12-06 13:46:18 2022-12-06 13:46:20
There are three sessions in the table, with session ids 111, 222, and 333; Each session has multiple records, but the session_id is the same; and the session is successful or unsuccessfulis depends on answer_time
is Null or not of the smallest id record of that session.
The id 1
and id 4
and id 7
records in the above sample table determine whether a session is successful or unsuccessful.
I have the below SQL to query it, and it works well.
WITH t AS
(
SELECT DISTINCT ON (session_id) start_time, answer_time
FROM logs
WHERE ((SELECT NOW() AT TIME ZONE 'UTC') - start_time < interval '24 HOURS')
ORDER BY logs.session_id, id
)
SELECT
COUNT(*) FILTER (WHERE (answer_time IS NOT NULL)) AS sccess_count,
COUNT(*) FILTER (WHERE (answer_time IS NULL)) AS fail_count
FROM t;
But if the DB table have about 50M records, the query taken 20 seconds, this is unacceptable in the production environment, how can I optimize it? My goal is less than 1 second for the 50M records.
Edit, the below is the table SQL file:
/*
Navicat PostgreSQL Data Transfer
Source Server : Test Server
Source Server Type : PostgreSQL
Source Server Version : 140004 (140004)
Source Host : localhost:5832
Source Catalog : pserver1
Source Schema : public
Target Server Type : PostgreSQL
Target Server Version : 140004 (140004)
File Encoding : 65001
Date: 07/12/2022 19:53:07
*/
-- ----------------------------
-- Table structure for logs
-- ----------------------------
DROP TABLE IF EXISTS "public"."logs";
CREATE TABLE "public"."logs" (
"id" int8 NOT NULL,
"company_id" int8 NOT NULL,
"session_id" int8 NOT NULL,
"start_time" timestamp(6) NOT NULL,
"answer_time" timestamp(6)
)
;
-- ----------------------------
-- Indexes structure for table logs
-- ----------------------------
CREATE INDEX "calllog_start_time_idx_copy1" ON "public"."logs" USING btree (
"start_time" "pg_catalog"."timestamp_ops" DESC NULLS FIRST
);
CREATE INDEX "idx_calllog_session_id_copy1" ON "public"."logs" USING btree (
"session_id" "pg_catalog"."int8_ops" ASC NULLS LAST
);
-- ----------------------------
-- Triggers structure for table logs
-- ----------------------------
CREATE TRIGGER "ts_insert_blocker" BEFORE INSERT ON "public"."logs"
FOR EACH ROW
EXECUTE PROCEDURE "_timescaledb_internal"."insert_blocker"();
-- ----------------------------
-- Primary Key structure for table logs
-- ----------------------------
ALTER TABLE "public"."logs" ADD CONSTRAINT "calllog_copy1_pkey" PRIMARY KEY ("id", "start_time");