Blog

Indeterminates in London

Today my masters student Joel Helling presented our joint work, Constructing an Indeterminate String from its Associated Graph,  co-authored with P.J. Ryan, W.F. Smyth, at the LSD & LAW 2017 conference. This work started with the visit of W.F. Smyth at CSU Channel Islands in March 2016 – see here, when Joel, Bill and I started working on an initial algorithm designed by Joel; this algorithm computed a labeling for a graph such that any two nodes that shared an edge would also share a label, and would not share a label otherwise. Of course, the easy way to do this is to assign each edge a unique label; but there are better ways of doing it, in the sense that fewer labels may be required (for example, imagine a clique: then a single label would do it). This problem is directly related to indeterminate strings; strings that arise in bio-informatics, where certain positions may consist of several possible symbols. For example, {a,b}ab{a,b,c}{b,c}a, where the first position is not determined (it could be a or b), the second position is a, the third b, the fourth is again not determined (it could be a or b or c), etc. Such strings are well suited to represent genetic
information which often contains “noise”. The paper is now accepted for publication in Elsevier’s journal of Theoretical Computer Science,  as well as presented by Joel at the conference.

I would like to express my gratitude to the British Royal Society for awarding us an exchange grant, CSU Channel Islands / King’s College London, which financed the trip of Joel Helling.

CI Facebook post.

Running experiments on AWS

I can’t say how happy I am to have AWS. I just got the account set up, started my first instance, and run a simulation for a very interesting project that I am working on with Ryan McIntyre (a student in CS). What took about 15min on my Mac Pro quad core, took 1m40s on the AWS instance.

This is a brave new world! 🙂 . Here us the summary of the experiment:

~/EdgeGraph/EdgeGraph$ time python3 cover_vs_edges.py 
How many vertices? 7
Generating graphs...
Filtering isomorphisms...
Sorting graphs...
Checking up to 21 edges...
0 / 21 edges complete.
1 / 21 edges complete.
2 / 21 edges complete.
3 / 21 edges complete.
4 / 21 edges complete.
5 / 21 edges complete.
6 / 21 edges complete.
7 / 21 edges complete.
8 / 21 edges complete.
9 / 21 edges complete.
10 / 21 edges complete.
11 / 21 edges complete.
12 / 21 edges complete.
13 / 21 edges complete.
14 / 21 edges complete.
15 / 21 edges complete.
16 / 21 edges complete.
17 / 21 edges complete.
18 / 21 edges complete.
19 / 21 edges complete.
20 / 21 edges complete.
21 / 21 edges complete.
elapsed time: --- 96.4 seconds ---

real 1m40.000s
user 1m36.812s
sys 0m0.113s

White House announces boost to computer science education

The White House announced on Monday new initiatives to bolster computer science in K–12 education.

Citing the rapidly expanding demand for technology jobs, the Obama administration outlined new efforts by two federal agencies: The National Science Foundation plans to spend $20 million on computer science education in 2017, on top the the $25 million it spent in 2016, with an emphasis on training teachers.

And the National Science and Technology Council will create a framework to help guide federal efforts “to support the integration of computer science and computational thinking into K–12 education,” according to Monday’s release.

The two agencies’ efforts, it said, will complement the Obama administration’s wider efforts to expand science, technology, engineering and math (STEM) in education.

The White House announcement comes in conjunction with new commitments to computer science education by 250 organizations, including Bootstrap, STEMteachersNYC and the American Association of Physics Teachers.

Other announcements include Google’s new computer science career prep program for college students and the University of North Texas’s partnership with the Perkins School for the Blind and the California School for the Blind.

Computer science is playing an increasingly large role in STEM — nearly two-thirds of all STEM jobs require computing skills. Despite the large need and an overwhelming desire by parents for their children to learn computer science, only about 40 percent of schools offer classes on the subject.

Source: White House announces boost to computer science education | TheHill

Assistant/Associate Professor in Algorithms and Data Structures, Complexity Theory, or Discrete Mathematics

DTU Compute’s Section for Algorithms, Logic and Graph theory (AlgoLoG), invite applications for an appointment as Assistant or Associate Professor within Algorithms and Data structures, Complexity theory, or Discrete mathematics. The position is available from 01.05.2017 or according to mutual agreement.

Source: Assistant/Associate Professor in Algorithms and Data Structures, Complexity Theory, or Discrete Mathematics

Graduate positions at Bonsai group from University of Lille and Inria (France), starting January 2017

A position for a master thesis (4-6 months) and a PhD grant (36
months) are available in the Bonsai group from University of Lille and
Inria (France), starting January 2017 or later.

The Bonsai research group focuses on efficient algorithms and data
structures for high-throughput sequencing data.  This position is
funded by a national collaborative project on Nanopore sequencing,
that gathers groups in computational biology and wet labs (ANR ASTER).
The main objective is to define a general purpose local alignment
algorithm that is able to handle long error-prone reads, using
advanced techniques such as dimensionality reduction, seed-and-extend
methods, alignment-free methods. The post requires a high level
of expertise in discrete algorithms and good programming skills (C++).

For more information on the position and the application process,
please contact Laurent Noé and Hélène Touzet
(laurent.noe@univ-lille1.fr,helene.touzet@univ-lille1.fr)

Rabin Miller algorithm: witnesses of compositness

The Rabin Miller algorithm for primality testing relies on a random selection of a’s for which it performs some basic tests in order to determine whether the input n is prime or composite. If n is prime, all a’s work, but if n is composite it is possible to show that at least half of the a’s work, i.e., at least half of the a’s are witnesses of compositness. But in fact, it seems like a lot more a’s than just half of those in the set {1,2,…,n-1} work; see the following diagram where I test all composites under about 60,000. It is clear that there are more good witnesses than just half the numbers.

plot-witnesses

Talk by Jan Holub on pattern matching in genomes

Talk by Jan Holub, Exact and Approximate Pattern Matching in Genomes, on Tuesday November 1, at 6pm, in Sierra 1422.

Bio: Jan Holub is a professor at Faculty of Information Technology in Czech Technical University in Prague. He leads a research group called the Prague Stringology Club and organizes annual conference Prague Stringology Conference. He received his Ph.D. in 2000. He spent some time in University of Marne-la-Vallee, France, McMaster University, Canada, Ecole polytechnique, France, and Aalto University, Finland. Currently he is a Fulbright scholar at Pennsylvania State University, USA. His research interests cover stringology (computer science about algorithms over strings), data compression, and bioinformatics.

flyer_jholub