The ability of bacteria and archaea to modulate metabolic processes, defensive
responses, and pathogenic capabilities depends on their repertoire of genes and
their capacity to regulate the expression of those genes. Transcription factors
(TFs) have fundamental roles in controlling these processes. TFs are proteins
that promote or impede the activity of RNA polymerase. In prokaryotes, these
proteins have been grouped into families that can be found in most of the
different taxonomic divisions. In this work, the expansion patterns of 111
regulatory protein families were systematically evaluated across 1351
non-redundant prokaryotic genomes. This analysis provides
insights into the functional and evolutionary constraints imposed on different
classes of regulatory factors in bacterial and archaeal organisms. Based on
their distribution, we found a relationship between the sizes of some TF
families and genome size. For example, nine TF families that represent 43.7% of
the complete collection of TFs are closely associated with genome size; i.e.,
members of these families are abundant in large genomes and scarce in small
ones. In contrast, the remaining 102 families (56.3% of the collection) show no
or only a low correlation with genome size, suggesting that a large proportion
of gene duplication or loss events occur independently of genome size and that
various yet-unexplored questions about the evolution of these TF families
remain. In addition, we
identified a group of families that have a similar distribution pattern across
Bacteria and Archaea, suggesting shared functions and probable coevolution,
and a group of families universally distributed among all the genomes. Finally,
a specific association between the TF families and their additional domains was
identified, suggesting that the families sense specific signals or make
specific protein-protein contacts to fulfill their regulatory roles.
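
As an illustration of how the association between TF family size and genome size might be quantified, the following minimal sketch computes a rank correlation per family. It makes several assumptions that are not stated in the text above: the choice of Spearman rank correlation, the family names, and the toy counts are all hypothetical and serve only to show the shape of the calculation.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = shared_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input has zero variance."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: genes per genome and TF counts per family,
# one entry per genome. These numbers are invented for illustration.
genome_sizes = [1200, 2500, 4100, 5300, 7000]
family_counts = {
    "family_A": [2, 6, 15, 22, 40],  # grows with genome size
    "family_B": [1, 1, 2, 1, 1],     # roughly constant
}

correlations = {
    name: spearman(genome_sizes, counts)
    for name, counts in family_counts.items()
}

for name, rho in correlations.items():
    print(f"{name}: rho = {rho:.2f}")
```

In this toy setting, a family whose counts rise monotonically with genome size yields a correlation near 1, whereas a family with an essentially constant count yields a value near 0, mirroring the two groups of families described above.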