Daniel Spiegel (University of Colorado, Physics Department) Quantum Field Theory from a physicist's point of view
Mar. 20, 2019 4pm (MATH 350)
Grad Student Seminar
Lucas Laird (CU Boulder)
Biological sequences are one of the primary types of data used in computational biology. Such sequences can be represented as k-mers, strings of length k with symbols chosen from a reference alphabet. In contrast, powerful machine learning algorithms often require numeric vector representations of the data that are low-dimensional and minimally distorted/biased. Hamming graphs and their so-called "resolving sets" have been shown to produce efficient vector embeddings of k-mers. Here, we reduce the question of whether a set of nodes in a Hamming graph is resolving to solving a constrained linear system and show that such a system can be solved by finding the intersection of a matrix null space with the roots of a polynomial system. Gröbner bases then provide a highly efficient characterization of resolvability on Hamming graphs with the goal of improving k-mer vector representations.
Resolvability of Hamming Graphs for Applications in Computational Biology
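To make the notion of a resolving set in the abstract above concrete, here is a minimal Python sketch. It assumes the standard definition: a set of nodes is resolving if the vector of Hamming distances from any k-mer to those nodes identifies that k-mer uniquely. The check below is a brute-force enumeration over all k-mers, not the linear-system and Gröbner basis method described in the talk, and the function names (`hamming`, `is_resolving`, `embed`) are illustrative choices rather than anything from the speaker's work.

```python
from itertools import product


def hamming(u, v):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(u, v))


def embed(kmer, nodes):
    """Distance vector of a k-mer with respect to a list of reference nodes."""
    return tuple(hamming(kmer, r) for r in nodes)


def is_resolving(nodes, k, alphabet):
    """Brute-force test: True if every k-mer gets a distinct distance vector.

    Enumerates all len(alphabet)**k k-mers, so it is only practical for
    small k and small alphabets.
    """
    seen = set()
    for kmer in product(alphabet, repeat=k):
        signature = embed(kmer, nodes)
        if signature in seen:
            return False  # two k-mers share a distance vector
        seen.add(signature)
    return True


if __name__ == "__main__":
    # 2-mers over a two-letter alphabet: the Hamming graph is a 4-cycle.
    print(is_resolving(["AA"], 2, "AC"))        # AC and CA are both at distance 1
    print(is_resolving(["AA", "AC"], 2, "AC"))  # all four distance vectors differ
    print(embed("CA", ["AA", "AC"]))
```

When `is_resolving` returns `True`, `embed` maps each k-mer to a numeric vector whose length equals the number of reference nodes, which is the kind of low-dimensional representation the abstract refers to. The exhaustive enumeration grows exponentially in k, which is one reason an algebraic characterization is attractive.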