CHEM623 (Chemometrics)

Syllabus for Fall, 2011

Prof. Steven Brown
Office: 239 Brown Laboratory   Office Hours: MT 12-1, R 10-11
Voice mail: 831-6861 
E-mail: sdb@udel.edu
Fall 2011 
9:05-9:55 MWF  
116 BrL
Texts & Resources
Texts/Resources/Readings/Supplies

Required Texts:

Chemometrics with R, by R. Wehrens, Springer 2011. ISBN: 978-3-642-17841-2. (Now that I have seen this book in its entirety, I am not all that happy with it. We will discuss options at the first class, which should be Sept. 2nd.)

Matlab: A Practical Introduction to Programming and Problem Solving, 2nd Ed., by S. Attaway, Butterworth-Heinemann, 2011. ISBN: 978-0123850812.
(This was published August 11, 2011 so there may be a delay in the bookstore receiving it.)

Both texts are available as e-books - I am not sure how quickly one can get either in that format, however.

There will also be required readings from handouts and papers from the literature available for download at the Chem 623 site on Sakai.

You will need to be registered for the course as a student or as a listener to gain access to the Sakai site.

Other Required Items:

1. Ready access to a suitable computer where you can install and use Matlab data analysis software. This can be a group computer or your own computer. You will probably need access to a printer and the internet from this computer.

2. Matlab computational software installed and functional on your computer. You can use Matlab at some campus computing sites, get access to your group's Matlab software or get Matlab software (for campus use only) from Pat McMahon. If you don't have easy, constant access to a recent version of this software, you will need to purchase a student version directly from Mathworks. A copy of Matlab prior to Version 7 may not work with some of the chemometrics software provided. If you purchase a student version, be sure to get it directly from Mathworks to avoid getting out-of-date software.

Note: If you already know a suitable data analysis language, you may use it in this course. Doing work in R, Igor, or Mathematica is acceptable for homework and exams. You will need to cheerfully accept the limitations imposed by your choice and port any chemometrics software that I provide you in Matlab to the language you have selected, or find a suitable replacement chemometrics package coded in the language you selected. Both R and Matlab have chemometrics packages, but other languages (so far as I am aware) do not. While Excel has one, please note that Excel/Visual Basic is not an acceptable language for work in this course. In my and others' opinion, Excel cannot be trusted for anything beyond routine spreadsheet calculations.

3. Software from Chem 623 for use in the Matlab computational environment. This semester, we will use the last public-domain version (v1.3) of the PLS_Toolbox for demonstrations and answers to homework. You are permitted to use it and any other public domain software as well as any code that I provide or code that you write yourself on all homework and for the exams. The Matlab-based software in the PLS_Toolbox v1.3 as modified by me to work with recent versions of Matlab is free for you to use in the course and beyond, but there is no support available from Eigenvector, the company that now sells the Toolbox. You must be enrolled in the course to get any of the free chemometrics software, all of which will be available from the Sakai site.

Materials to be used during the course include simulations, short video clips and analysis of some datasets using software provided and software available from the internet. You will need to download data sets and software packages and to check some work against published results available on the internet. You will be asked to submit your work in digital form (as a Word 'doc' file or a PDF file) via Sakai because of the length of the answers to homework sets and exams. In general, long lists of output will not be acceptable as an answer to homework or exam questions in this course; you will need to become proficient at plotting and summarizing results of analyses. Matlab or other results must be integrated into your work, and may not be submitted as separate files.

Learning Resources:
Full lecture notes will be available in PDF format online, through Sakai. I will also provide PDF or hardcopy of supporting material.
You are expected to attend lecture and to read material that I provide. I am unable to provide full online transcripts for all demonstrations and video clips shown in the course, but you will have source code in Matlab for much of the material presented.

All lectures in this course will be recorded, as will any in-class questions and my answers, and will be made available on UD Capture. I will make up missed classes by recording the lectures for UD Capture.
Make-up lecture dates and times will be announced. Students are welcome to attend these. Please note that I retain copyright on all materials associated with the course, except where noted, and you will need to get written permission from me to distribute course materials.
 
Student feedback on instruction:
I will ask for student feedback at midterm for course/instructor improvement purposes. There will also be an end-of-term student evaluation with a supplement to the departmental student evaluation form. I welcome comments and constructive criticism at any time.

Catalog Description

Chemistry 623 is a graduate-level overview course in the analysis of data generated from instrumentation used in chemistry, biochemistry and related fields. The emphasis is on the understanding and practical application of chemometric methods. The course is intended for graduate students or for advanced majors in chemistry, biochemistry or chemical engineering who need to analyze data obtained from such instruments. This course presumes some knowledge of basic statistics and some prior exposure to simple computer programming. Brief reviews of concepts from probability, decision theory and experimental design are included as background, as is a discussion of processing of chemical signals to improve signal quality. The course's main focus is on the systematic evaluation of high-dimensional data through multivariate calibration and classification of multivariate chemical responses.


Course Requirements and Policies

Course Requirements

This course is an introduction to computational analysis of data from chemical instrumentation. Chemometrics involves some math, so you will need to become comfortable with probability, statistical tests, matrix algebra and regression. You will learn a mixture of theory and practice, and will be asked to implement the theory in working computer code. Much of the code that you will need will be made available to you, but you will need to know how to make changes to existing code to do some of the work required. This skill will enable you to make use of the large code base available on the web.
 
Each homework set will involve some theory, some computation and some critical evaluation of results.


Instructor Absences:


The instructor has invited conference presentations scheduled for 8/30-9/1/2011, 9/24-10/4/2011, 10/13-16/2011, and 11/16/2011. Every effort will be made to find times to make up lectures missed because of these absences.


Course Policies

Academic Honesty:

You are encouraged to become familiar with the University’s Policy of Academic Honesty found in the UD Student Guide to University Policies. More on the whole issue of academic integrity can be found here. Policies delineated in the Guide apply to this course. While homework sets for Chem 623 can be done in collaboration with others enrolled in the course, all work on the out-of-class examinations must be done entirely independently. By turning in work to the instructor of this course, you acknowledge being made aware of the academic honesty policy and affirm your adherence to the letter and spirit of the policy.

Assignments:

Homework deadlines are posted and you are expected to meet the deadlines. If you have a problem and cannot make a deadline, please let me know. I may be able to allow some extra time for a once-only problem. Repeated late work will be penalized. Work missed for a reason - documented illness or family emergency, and conference travel (but only if you advise me by e-mail well in advance of the travel), etc. - can be made up without penalty.


Grading

Grading, Evaluation Policies and Procedures:

The course will be graded on the basis of your performance on homework, on a 3-day take-home midterm exam and on a 5-day, take-home, calculation-based final exam. The grade given will be determined by the total number of points earned.

The distribution of points is as follows:

Task                                      Points

Homework (5 sets, each worth 20 pts):     100 pts
Midterm Exam (11/2-11/7/11):              100 pts
Final Examination (12/7-12/14/11):        100 pts

TOTAL:                                    300 pts



Grading Scale:

  > 240 pts   A
181-239 pts   B
121-180 pts   C
 60-120 pts   D
   < 60 pts   F

The average grade earned by previous students in this course has been B+.
Be aware that this course has a substantial workload: it requires steady, focused effort from a student.


Calendar

Tentative Schedule for Lectures:

All lectures are scheduled for 0905-0955 MWF in 116 Brown Laboratory.
The schedule given below is approximate and may vary to reflect scheduling changes and student needs.

Note that the instructor will be away several days (see above) during this semester because of previously arranged conference presentations.
Make-ups will be scheduled if possible and will be recorded on UD Capture.

Week      Topics to be Covered

8/29/11    Overview of Chemometrics, Introduction to Random Variables

9/5/11      Estimation, Confidence Intervals and Statistical Testing

9/12/11    Linear Regression Methods

9/19/11    Linear Regression Methods

9/26/11    Methods for Modeling Multivariate Data

10/3/11    Chemical Calibration - 1 - Curve Fitting Data

10/10/11   Chemical Calibration - 2 - Classical and Inverse Analysis

10/17/11   Chemical Calibration - 3 - Soft Calibration

10/24/11   Chemical Calibration - 4 - More on PLS Regression

10/31/11   Chemical Calibration - 5 - Multiway Methods

MIDTERM EXAM 11/02-11/07/11 (Take Home, Computer-Based)

11/7/11    Classification Methods - 1 - Supervised Methods

11/14/11   Classification Methods - 2 - Unsupervised Methods

11/21/11   Classification Methods - 3 - Advanced Methods

11/28/11   Self-Modeling Methods in Evolving Systems

12/5/11    Preprocessing and Signal Correction Methods

FINAL EXAMINATION 12/07-12/14/11 (Take Home, Computer-Based)


General course information

Course information

Course pre-requisites:

This course presumes some knowledge of chemical instrumentation at the level of Chem 437 and Chem 438. Students should also have had some exposure to basic statistics as covered in an elementary statistics course or in Chem 120. Prior experience with scientific programming is not required but will be helpful.

Course Description:

This course covers an introduction to analysis of data from chemical instrumentation.

A brief review of basic statistics and probability is given. Regression methods are introduced to model the sources of variance, and approaches are covered to develop, evaluate and improve regression models. Soft modeling is developed as a way to deal with bias created from model-data mismatches, and soft modeling-based projections are shown to be effective for visual examination of multivariate data such as spectra. Methods are then presented to relate multivariate data to group and to external properties. Prediction of group membership is demonstrated, and prediction of an external property is discussed in some detail. A brief introduction to signal processing is provided. As a final topic, the discovery of the underlying chemical signatures of the pure components comprising a mixed response is discussed, and methods for systematic discovery of those components are presented.
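
As a rough illustration of the regression portion of this description, the sketch below fits a straight-line calibration by ordinary least squares, checks the fit with R^2, and inverts the line to estimate the concentration of an unknown. It is only a minimal sketch: it is written in Python rather than the Matlab or R used in the course, the function names are made up for this example, and the data are hypothetical values that do not come from any course material.

```python
# Minimal univariate calibration sketch.
# All data below are hypothetical and chosen only for illustration.


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Fraction of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def predict_concentration(response, slope, intercept):
    """Invert the calibration line to estimate concentration from a response."""
    return (response - intercept) / slope


# Hypothetical standards: concentration (arbitrary units) vs. instrument response.
conc = [0.0, 1.0, 2.0, 3.0, 4.0]
resp = [0.02, 0.51, 0.98, 1.52, 1.97]

slope, intercept = fit_line(conc, resp)
r2 = r_squared(conc, resp, slope, intercept)
unknown = predict_concentration(1.25, slope, intercept)
print(f"slope={slope:.4f} intercept={intercept:.4f} R^2={r2:.4f} unknown={unknown:.3f}")
```

Printing a single summary line rather than the raw residuals follows the course's preference for summarized results over long lists of output. A fuller treatment would also report uncertainty in the predicted concentration.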

Course Objectives:

Students completing this course should be able to read, understand, and critically evaluate literature making use of basic techniques as used in computational statistics and chemometrics. They should also be able to perform basic chemometric data analysis by developing their own MATLAB (or R) code or by modifying code provided by others.

Completion of this course will provide the student a foundation for research in measurement-oriented chemistry. It also prepares students for more advanced work in computational modeling or chemometrics and for applying chemometric methods in other research projects.


Departmental Objectives:

This course meets Departmental Objectives 1, 2, 4, 5, 6 and 9.



Last Updated: 26 August 2011

Copyright © 2011 University of Delaware