Computer Science selected for an AWS Pilot program in Machine Learning

We are very happy to have been selected for a SageMaker Pilot for AWS Educate Classrooms! Machine Learning (ML) is a top hard skill for graduates, and it is also becoming a premier tool for research in all areas. SageMaker Studio is a complete development environment for ML.

The theory of ML can always be taught, but hands-on experience with ML requires a computing infrastructure that is beyond the means of most educational institutions. Our students will have access to AWS Educate accounts with credits to use the SageMaker Studio environment, and access to powerful CPU/GPU resources (ml.m5.xlarge, ml.c5.xlarge, and ml.g4dn.xlarge) for training ML models.

ML use cases include spam filtering for emails, recommender systems (e.g., Netflix show recommendations), and uncovering credit card fraud. There are three types of ML: supervised, where the data is labeled and the expected outputs are well understood (e.g., is this email spam or not); unsupervised, where the ML algorithm has to discover the salient properties of the data; and reinforcement, where some agent (e.g., RoboMaker) interacts with an environment and learns to navigate it through a system of rewards.
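To make the supervised case concrete, here is a minimal sketch of a spam classifier in plain Python. It is only an illustration: the training messages are invented, the technique (naive Bayes with add-one smoothing) is one common choice among many, and the function names are our own rather than part of any library.

```python
import math
from collections import Counter

# Toy labeled training data, invented for illustration only.
TRAINING = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lecture notes for class", "ham"),
    ("lunch on monday", "ham"),
]

def train(examples):
    """Count words per label and messages per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest log-posterior (naive Bayes, add-one smoothing)."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log-prior of the label.
        score = math.log(label_counts[label] / total_messages)
        total_words = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing out the probability.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, label_counts = train(TRAINING)
print(classify("free prize now", word_counts, label_counts))            # spam
print(classify("notes for monday lecture", word_counts, label_counts))  # ham
```

The labels in `TRAINING` are what make this supervised: the algorithm is told the expected output for each example and only has to generalize from it.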

We applied last July to be part of the AWS pilot program to make SageMaker available to our students, and we were approved to start in fall 2020. We have a group of about 10 students who are going to be learning to use it under my supervision.

We are building on our growing expertise in Artificial Intelligence. This fall term, Professor Reza Abdolee is teaching a graduate class in AI (COMP569) and Professor Bahareh Abbasi is teaching both an undergraduate course in AI (COMP469) and a graduate class in Neural Networks (COMP572).

ML is one of the areas of AWS certification.

Students will learn a variety of auxiliary tools; as you will see from this list, the Python programming language is central to Data Analytics:

  • Jupyter Notebook and Jupyter Lab: an open-source web application that allows the creation and sharing of documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, etc.
  • Pandas: a software library written for the Python programming language for data manipulation and analysis. In particular, it offers data structures and operations for manipulating numerical tables and time series.
  • Seaborn: a library for making statistical graphics in Python. It is built on top of Matplotlib and closely integrated with Pandas data structures.
  • Scikit-learn: a free software machine learning library for the Python programming language. It features various classification, regression and clustering algorithms.
  • Matplotlib: a comprehensive library for creating static, animated, and interactive visualizations in Python.
  • NumPy: a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.
  • PyTorch (AWS testimonials): an open source machine learning library based on the Torch library, used for applications such as computer vision and natural language processing, primarily developed by Facebook’s AI Research lab.

In the words of Jose Cahue:

One of the major hurdles to learn ML as a student is having access to a machine optimized for model training. Cloud computing can be one practical solution to provide the computation resources needed to learn ML.

Jose Cahue

Our results in “Edge Covers” picked up by an Astronautics research group at MIT

In May 2018, Ryan McIntyre defended his master’s thesis (at CSUCI under my supervision) on Bounding the size of minimal clique covers. We followed up with a publication of the results in the Journal of Discrete Algorithms (https://doi.org/10.1016/j.jda.2018.03.002) [post], and now, two years later, our results are cited and built upon in an interesting paper, Static beam placement and frequency plan algorithms for LEO constellations (https://doi.org/10.1002/sat.1345), written by an Astronautics research group at MIT.

What is interesting about this is the serendipitous manner in which results build on each other: our result was a partial solution to an original problem in combinatorics posed in the mid 1960s by the itinerant mathematician Paul Erdős, which we then used to partially solve a problem related to indeterminate strings (also in this case building on previous results of Joel Helling [post]), which are related to genetics. Now, our work is being used to solve the problem of satellite allocation.
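For readers unfamiliar with the terminology, the following sketch shows what a clique cover of the edges of a graph is, assuming the usual definition: a collection of cliques such that every edge of the graph lies inside at least one of them. It only checks the definition on a small hypothetical graph; it is not the bound from the thesis or the paper, and the function names are ours.

```python
from itertools import combinations

def is_clique(vertices, edges):
    """True if every pair of distinct vertices in `vertices` is joined by an edge."""
    return all(frozenset(pair) in edges for pair in combinations(vertices, 2))

def is_edge_clique_cover(cliques, edges):
    """True if each set is a clique and every edge lies inside at least one of them."""
    if not all(is_clique(c, edges) for c in cliques):
        return False
    covered = {frozenset(pair) for c in cliques for pair in combinations(c, 2)}
    return edges <= covered

# Hypothetical example graph: a triangle 1-2-3 with an extra edge 3-4.
edges = {frozenset(e) for e in [(1, 2), (2, 3), (1, 3), (3, 4)]}

print(is_edge_clique_cover([{1, 2, 3}, {3, 4}], edges))  # True
print(is_edge_clique_cover([{1, 2, 3}], edges))          # False: edge 3-4 is uncovered
print(is_edge_clique_cover([{1, 2, 3, 4}], edges))       # False: {1, 2, 3, 4} is not a clique
```

In this example two cliques suffice, and no single clique can do the job; questions about how small or large such covers can be are the kind of problem the thesis addresses.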

CI master’s students’ research accepted at the KES2020 international conference in Verona

KES 2020 in Verona but virtual

CSUCI Master of Computer Science students were successful in submitting two papers to KES 2020, the 24th International Conference on Knowledge-Based and Intelligent Information & Engineering Systems, which this year is taking place in Verona, Italy, in September 2020. However, due to the COVID pandemic, the conference will be held virtually. The papers are the following:

  • Malware Persistence Mechanisms, co-authored by Zane Gittins and Michael Soltys. Zane Gittins is a master’s student in Computer Science at CSUCI, and this paper is the result of his master’s thesis. Zane Gittins has worked as a Cybersecurity expert at HAAS, and is currently working at Meissner Filtration. (This paper will be presented in the General Track session G3b: Cybersecurity.)
  • Voyager: Tracking with a Click, co-authored by Samuel Decanio, Kimo Hildreth and Michael Soltys. Sam Decanio is a master’s student in Computer Science at CSUCI, and this paper is the result of his master’s thesis and a fruitful collaboration between Computer Science at CI and the SoCal High Technology Task Force. Sam Decanio is currently working for the Navy. (This paper will be presented in the General Track session G3b: Cybersecurity.)

Teaching online: ten suggestions for success

https://aws.amazon.com/education/education-webinars/

On April 7, 2020, I gave a talk during an AWS Webinar in a series on Remote Learning (as advertised in this post, mentioned here, and here).

The webinar can now be watched on YouTube:


(or here on the original page), and the slides that I used for the presentation are the following:

Michael-Soltys-TeachingOnline

The ten points for success are summarized here:

Point 1: Don’t think of this move to online teaching as a one-off; this is the new normal. At California State University we have had to move to online teaching practically every year in the last five years: fires (twice), shootings, and now the pandemic. So think of the COVID-19 pandemic as an opportunity to build an online offering that can serve your department and students for years. You should have an online version for all your classes, not only for emergencies, but also to be responsive to the current reality where so many students want online offerings.

Point 2: There are two initial “shifts” in the move to online teaching. First, the pedagogical shift to not teaching in the classroom, where it is easy to connect with students physically present, to read facial expressions and adjust your teaching accordingly, to chat with some of them in person after class.  Second, the shift to a different usage of tools, or a different set of tools altogether: Zoom, Canvas, Piazza, MyITLab, Slack, Microsoft Teams, and of course AWS Educate offerings. Both “shifts” require some time; e.g., think of how you are going to compensate for lack of physical presence, and do not start learning Zoom half an hour before the first class. 

Point 3: In Point 2 we mentioned the challenge of not having the students physically present; how are you going to compensate for the lack of interaction that you are used to? I use Slack to create a collaborative environment in the class. I dedicate a channel to the course, and include all the students in the channel. Students can interact with me (the instructor), but even more importantly, they can interact with each other, and they do! Here appears one advantage of online teaching: often, as the students sit to write down a difficulty they encounter in the course, by the act of writing it in a public forum, they concentrate more than they do when asking verbally in class, and the question is better formed and often the answer appears in the process. Also, having those interactions recorded in the channel allows us to point them out later if the question comes up again. Further interaction comes by using Zoom on a regular basis, both to teach, and to have office hours / question periods. 

Point 4: In Point 2 we mentioned the challenge of shifting to a new set of tools. For Computer Science faculty this is relatively easy from the technical perspective. We are familiar with cloud-based tools, and our students like IT tools, and so the move is seamless. What can be problematic is how these tools are deployed; that is, the heavy reliance on these tools can make the course about them instead of making them ancillary to the objective of the course. The solution here is to explain, or even better automate, the aspects of the tools that are not intrinsic to the topic being taught. For example, we use AWS Educate accounts to teach our Computer Architecture class (COMP 262), a sophomore course where students learn about different microprocessor architectures and assembly-level programming. Being able to deploy AMIs (Amazon Machine Images) with certain architectures frees the student to concentrate on the point of the exercise: the differences in architecture.

Point 5: It is important to be creative. More material can be taught successfully online than one would expect. For example, we have a senior elective in “mobile robotics” (COMP 470), which includes a lot of hands-on lab work. It may seem hopeless to simulate such a course online, but it is not – we used the material in the AWS Educate RoboMaker class to create virtual labs. Students can be given relatively inexpensive robots (e.g., AWS DeepRacer, ~$300 each), and participate in a lab by doing the hands-on activity at home, but testing and competing in a virtual environment in the cloud.

Point 6: Do not think of online teaching as simulating classroom teaching. It is a different entity, with its advantages and disadvantages; concentrate on the advantages. For example, simply using Zoom to deliver a lecture at the same time as a regular lecture won’t do. Your lecture will be dry, and you will feel frustrated, as if you were talking into your own screen instead of to a classroom full of students. Use Zoom to create an interactive environment, including quizzes (there are some nice tools to deliver interactive quizzes, which always awaken a sense of fun competition among students; e.g., Kahoot!, Quizizz), Zoom breakout rooms, question and answer sessions, presentations by students, etc.

Point 7: Grading has to be changed. For example, rely more on assignments, as in a final assignment rather than a final exam. Tests and exams can still be given, but I would suggest giving them as multiple-choice quizzes with a limited time per question, so that they do not become exercises in who can Google-search faster.

Point 8: In my experience online teaching has to be very well structured and organized, and the communication with the class has to be excellent: frequent, repetitive and complete. Students should know exactly what they need to do each week, and where to go with questions.

Point 9: Communicate enjoyment, passion and enthusiasm for the material. One of the most important roles of a teacher is to reassure the student that time spent with you, and the effort required to master your difficult material, is a worthy pursuit. Tell the students what treasure they will possess upon completion, what we dryly call SLO (Student Learning Outcomes), but which is the raison d’être for your course. Present your online offering not as “the 2nd best given the circumstances”, but rather as a great opportunity to work with others in an online setting – remember, this is the direction in which the IT world is moving, and students will benefit greatly from having the experience of being self-motivating, accountable and working with others online.

Point 10 (Bonus for Comp Sci instructors): Some material can be taught very easily online. For example, I prefer to teach programming classes in a blended online environment, even when we do not have a crisis! The reason is that AWS Cloud9 is a cloud-based IDE (Integrated Development Environment) that has many advantages over a machine-in-a-lab IDE: first, everyone has exactly the same environment, which I can customize to the needs of the course as precisely as I choose, and everyone can access this environment independently of the type of computer they have, as all it requires is a wi-fi connection and a browser. It also allows me to enter the environment from the “outside”, and code with the student watching my changes. This is really fantastic!

Alfred Camposagrado at Northrop Grumman

Alfred Camposagrado is a Principal Embedded Software Engineer at Northrop Grumman. He received his Bachelor’s in Computer Science at CSUCI in 2014. He started his journey in Camarillo working as a Software Engineer for Crescendo Interactive shortly after graduation. He gained valuable experience by starting as a front-end developer and was later promoted to a Full-Stack developer focusing on Java. His experience in Java landed him a job at Northrop Grumman. Located in Point Mugu, he supports the US Navy with various projects from software development to system integration tests. He also continues his education at CSUCI in the Masters of Computer Science Program (MSCS). http://linkedin.com/in/alfredcamposagrado

Governor’s Cybersecurity Task Force (GCTF)

I am very happy to be part of the California Governor’s Cybersecurity Task Force (GCTF), serving on the Workforce Development and Education Subcommittee. The main objective of this subcommittee is to address the growing workforce gap; currently, there are 37,000 available cybersecurity positions in California, and 314,000 in the nation. About 70% of those positions require a four-year degree or more.

The aim of our subcommittee is threefold: to enrich and standardize the educational pathway from K12 to PhD/Certification; to teach general Cyber hygiene, both to the workforce and the public; and to help military personnel, especially veterans, transition into civilian careers in Cybersecurity.

Computer Science at CI is well positioned to address some of the challenges:

  • A thriving program in Computer Science, with a minor in Cybersecurity; we are part of CyberWatchWest, we have a Cybersecurity student club, and we teach courses in Cybersecurity at the undergraduate and graduate level.
  • Experience in “hands-on” education, which is one of the aims of the workforce development. We have strong connections with the industry and the public sector (such as the SoCal High Technology Task Force).
  • An ongoing collaboration with the Navy; we have worked with both Navy officers and civilians as instructors and collaborators.

Please read more here.

SEAKER

Raspberry Pi controller, the hardware for SEAKER

In the summer of 2017, while I was teaching COMP 524 (Cybersecurity) at California State University Channel Islands, the students were introduced to a project based on R&D from the SoCal High Technology Task Force (HTTF). The requirements and specifications asked for a device that could automate the search through vast amounts of data contained in portable devices (such as hard disks and thumb-drives), looking for pre-established patterns in file-names.

The students designed and prototyped a device that we christened SEAKER (Storage Evaluator and Knowledge Extractor Reader), based on a Raspberry Pi, with a custom designed version of Raspbian (the OS running on Raspberry Pis), and a bash shell script for cloning such devices. The first presentation of SEAKER took place on August 7, 2017, to an audience composed of CI faculty and students, as well as investigators from the SoCal HTTF.
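The core requirement, searching file names for pre-established patterns, can be illustrated with a short sketch. This is not the SEAKER code (which ran on a custom Raspbian with bash scripting); it is a Python illustration using only the standard library, and the function name, the patterns, and the sample files are all hypothetical.

```python
import fnmatch
import os
import tempfile

def find_matching_files(root, patterns):
    """Yield paths under `root` whose file name matches any glob pattern (case-insensitive)."""
    lowered = [p.lower() for p in patterns]
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare lowercased names so "Passwords.TXT" matches "*password*".
            if any(fnmatch.fnmatchcase(name.lower(), p) for p in lowered):
                yield os.path.join(dirpath, name)

# Demonstration on a throwaway directory tree standing in for a mounted drive.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs", "old"))
    for rel in ["docs/report.txt", "docs/old/Passwords.TXT", "notes.md"]:
        open(os.path.join(root, *rel.split("/")), "w").close()

    hits = sorted(
        os.path.relpath(path, root)
        for path in find_matching_files(root, ["*password*", "*.md"])
    )
    print(hits)
```

Matching on names alone, without opening the files, is what makes this kind of search fast enough to run over large storage devices on modest hardware.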

As SEAKER was being developed, it was presented at various other venues, for example:

We have also published the research resulting from the SEAKER project:

  • As the masters thesis of Eric Gentry, April 2019 [pdf]
  • In the proceedings of the 2019 Future of Information and Communication Conference (FICC) [doi]
  • To appear in the proceedings of the 2019 23rd International Conference on Knowledge-Based and Intelligent Information & Engineering Systems (KES), track: Cybercrime Investigation and Digital Forensics

The Beast project

The Beast at the SCHTTF forensic lab

In September of 2018, a group of CI students, working on their senior capstone project under my supervision, started to build a machine capable of massive parallel computing. We christened the machine “The Beast.” We undertook to build the machine following the specification of the SoCal High Technology Task Force (HTTF) digital forensics lab in Ventura County.

The Beast was built with five EVGA GeForce GTX 1080Ti GPUs, capable of massive computational parallelism, an MSI Z370-A-Pro motherboard, an i5-8400 CPU, as well as a Hydra II 8 GPU 6U Server Mining Rig Case, and a power supply capable of maintaining four big fans; cooling The Beast was an important part of the project.

Presenting The Beast at the Capstone Showcase

The students who participated in the project were, in alphabetical order, Noelle Abe, Benjamin Alcazar, Matthew Atcheson, Joshua Buckley, Joshua Carter, John Miller, Scott Slocum, Ryan Torres and Devon Trammell (the team leader). On May 2nd, after working on the project during both terms of 2018/19, and having overcome many technical difficulties, the team presented The Beast at the Computer Science Advisory Board Meeting and the Computer Science Capstone Showcase; following these presentations, The Beast was handed over to the SoCal HTTF digital forensics lab. As you can see from the first picture above, The Beast has settled in its new home, a cooling room at the HTTF lab.

Computer Science 4th Advisory Board Meeting

On May 2, 2019, we held our fourth bi-annual Computer Science Advisory Board Meeting. The meeting started with lunch at the top (3rd) floor of Broome library, and continued with a two hour set of presentations in the Handel Evans room, also at Broome.

agenda

  • 12:00 PM – Lunch, 3rd Floor of Broome Library
  • 12:50 PM – Transition to J Handel Evans (Broome Library Rm 2533) 
  • 1:00 PM – Welcome, Agenda Overview & Introductions – Chris Meissner
  • 1:15 PM – Department Overview – Michael Soltys
    • Student numbers
    • Faculty updates and hires
  • 1:30 PM – Welcome from the Dean – Vandana Kohli  
  • 1:40 PM – Student Presentation
    • Robotics – student Steven Romp
    • Beast – students Noelle Abe and Devon Trammell
    • CS Club – students Julia Maliauka and Ori Weiss
    • CS Coding club – student Michael Petracca
    • CS Girls Club – students Noelle Abe and Maria Contreras
    • CS Cybersecurity Club – student Richie Zins
  • 2:10 PM – Member profile – The Trade Desk – Zak Stengel, SVP Engineering
  • 2:30 PM – Discussion – Chris Meissner and Michael Soltys
    • How do we become a world class department?
    • How do we become a hub of expertise?
    • Examples of where we already partially achieve these goals
    • But we need help from the board to get there
  • 3:00 PM  – Transition to Capstone Showcase, Sierra Hall

presentation slides

AdvisoryBoard-MichaelSoltys

SRomp

ThePasswordBeast

Clubs-2019

NETSEC

Gallery

Pictures from the Capstone Showcase.

Brandon Artner software developer at Yardi

Brandon Artner is a Software Development Engineer at Yardi Systems. He graduated from CSUCI with a degree in Computer Science and Mathematics in 2018. While he was an undergraduate at CSUCI he was an Intern Software Engineer with TRAX International hosted by GBL Systems. He also worked at the STEM Center for four years as a CS and Math tutor. During his studies, he also contributed to research projects involving shape analysis and thin coat instrumentation. His senior capstone project was developing a cryptographic voting system under the guidance of Professor Soltys.