We have a set of users
CREATE TABLE `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`email` varchar(254) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_email` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
Each user can have one or many domains, such as
CREATE TABLE `domains` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` varchar(11) NOT NULL,
`domain` varchar(254) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `domain` (`domain`),
CONSTRAINT `domains_user_id_fk` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
And we have a table that has some sort of data, for this example it doesn't really matter what it contains
CREATE TABLE `some_data` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`content` TEXT NOT NULL,
PRIMARY KEY (`id`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
We want certain elements of some_data
to be accessible to only certain users
or only certain domains
(whitelist case).
In other cases we want elements of some_data
to be accessible to everyone BUT certain users
or certain domains
(blacklist case).
Ideally we would like to retrieve the list of domains that the given element of some_data
is accessible to in a single query and ideally do the reverse (list all the data the given domain has access to)
Our approach so far is a single table
CREATE TABLE `access_rules` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`rule_type` enum('blacklist','whitelist')
`some_data_id` int(11) NOT NULL,
`user_id` int(11) NOT NULL,
`domain_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
CONSTRAINT `access_rules_some_data_id_fk` FOREIGN KEY (`some_data_id`) REFERENCES `some_data` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPRESSED
The problem however is the fact that we need to query the db twice (to figure out if the given data entry is operating a blacklist or a whitelist [whitelist has higher priority]). (EDIT: it can be done in a single query)
Also since the domain_id is nullable (to allow blacklisting / whitelisting an entire user) joining is not easy
The API that will use this schema is currently hit 4-5k times per second so performance matters.
The users
table is relatively small (50k+ rows) and the domains
table is about 1.5 million entries. some_data
is also relatively small (sub 100k rows)
EDIT: the question is more around semantics and best practices. With the above structure I'm confident we can make it work, but the schema "feels wrong" and I'm wondering if there is better way