Blacklist / Whitelist Table Design

Ask Time：2019-07-23T19:23:50 Author：lePunk

We have a set of users

CREATE TABLE `users` (
   `id` int(11) NOT NULL AUTO_INCREMENT,
   `email` varchar(254) NOT NULL,
   PRIMARY KEY (`id`),
   UNIQUE KEY `unique_email` (`email`)
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED

Each user can have one or many domains, such as

CREATE TABLE `domains` (
   `id` int(11) NOT NULL AUTO_INCREMENT,
   `user_id` varchar(11) NOT NULL,
   `domain` varchar(254) NOT NULL,
   PRIMARY KEY (`id`),
   UNIQUE KEY `domain` (`domain`),
   CONSTRAINT `domains_user_id_fk` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED

And we have a table that has some sort of data, for this example it doesn't really matter what it contains

CREATE TABLE `some_data` (
   `id` int(11) NOT NULL AUTO_INCREMENT,
   `content` TEXT NOT NULL,
   PRIMARY KEY (`id`),
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED

We want certain elements of some_data to be accessible to only certain users or only certain domains (whitelist case). In other cases we want elements of some_data to be accessible to everyone BUT certain users or certain domains (blacklist case). Ideally we would like to retrieve the list of domains that the given element of some_data is accessible to in a single query and ideally do the reverse (list all the data the given domain has access to)

Our approach so far is a single table

CREATE TABLE `access_rules` (
   `id` int(11) NOT NULL AUTO_INCREMENT,
   `rule_type` enum('blacklist','whitelist')
   `some_data_id` int(11) NOT NULL,
   `user_id` int(11) NOT NULL,
   `domain_id` int(11) DEFAULT NULL,
   PRIMARY KEY (`id`),
   CONSTRAINT `access_rules_some_data_id_fk` FOREIGN KEY (`some_data_id`) REFERENCES `some_data` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
 ) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED

The problem however is the fact that we need to query the db twice (to figure out if the given data entry is operating a blacklist or a whitelist [whitelist has higher priority]). (EDIT: it can be done in a single query) Also since the domain_id is nullable (to allow blacklisting / whitelisting an entire user) joining is not easy

The API that will use this schema is currently hit 4-5k times per second so performance matters. The users table is relatively small (50k+ rows) and the domains table is about 1.5 million entries. some_data is also relatively small (sub 100k rows)

EDIT: the question is more around semantics and best practices. With the above structure I'm confident we can make it work, but the schema "feels wrong" and I'm wondering if there is better way

Author:lePunk，eproduced under the CC 4.0 BY-SA copyright license with a link to the original source and this disclaimer.
Link to original article：https://stackoverflow.com/questions/57163084/blacklist-whitelist-table-design

Blacklist / Whitelist Table Design