DTAMS – Middle Science Teacher Assessments


Diagnostic Science Assessments for Middle School Teachers serve two purposes: (1) to describe the breadth and depth of science content knowledge so that researchers and evaluators can determine teacher knowledge growth over time, the effects of particular experiences (courses, professional development) on teachers' knowledge, or relationships among teacher content knowledge, teaching practice, and student performance and (2) to describe middle school teachers' strengths and weaknesses in science knowledge so that teachers can make appropriate decisions with regard to courses or further professional development.

The assessments measure science knowledge in three content domains: Physical Science, Life Science, and Earth/Space Science. Each assessment is composed of 25 items—20 multiple-choice and 5 open-response. Six versions of each assessment are available in paper-and-pencil format so that researchers, professional development providers, and course instructors can administer them as pre- and post-tests before and after workshops, institutes, or courses to determine growth in teachers' content knowledge.

Teams of researchers analyzed a number of standards documents and research literature to synthesize the science content (detailed below) middle school teachers should know. Five types of knowledge (detailed below) were also identified. This provided a 2-dimensional chart within which questions were generated to ensure both breadth of coverage (content) and depth of coverage (knowledge type). Click on Middle School Science Content Summary Chart [PDF] to see a summary of the content analysis of these documents. The numbers in each cell represent page numbers in the documents and the letters (A1, PS3, NC6, . . .) represent bibliographic references for research articles. Science topics that were identified in more than half of the sources (A in the far right column) were included in the assessments. The chart below summarizes this structure for the physical science assessment. Click on Types of Science Knowledge for Middle School Teacher Assessments to see descriptions of the knowledge types.

Teams of practicing science teachers, science teacher educators, and scientists generated test items intended to simultaneously target a particular content area and a particular knowledge type. Assessment-wide, items were targeted to be balanced across both dimensions.

Establishing Validity

Test items for each content area were sent out to approximately 40 external reviewers from each of the same three groups (science teachers, science educators, scientists). These external reviewers categorized questions into a content category and a knowledge type. They also rated the appropriateness of each question and provided other suggestions for improving the questions.

Based on reviewer feedback, questions were selected, revised, and assembled into field tests. Parallel questions were generated to produce 6 versions of each content-area field test. Tests are designed to be completed by test-takers within an hour. Each test consisted of 20 multiple-choice and 5 open-response questions. Each assessment has 3-4 science subdomains. Click on Middle School Science Subcategories to see the specific topics in each subcategory. The table below summarizes the subdomains for each assessment:

Physical Science       Life Science           Earth/Space Science
Motion and Forces      Internal Regulation    Lithosphere

Each team developed item specification charts for each of the assessments. These charts describe the content and knowledge type of items on each of the three assessments. Click on Physical Science [PDF], Life Science [PDF], or Earth/Space Science [PDF] to view the item specification chart for each assessment.

Evidence of validity of the items for measuring teacher content knowledge in the various categories was established by asking external reviewers to review the items. Items were edited and sorted into randomized sets. They were sent to reviewers along with a review form that solicited: 1) the correct answer to the multiple-choice items; 2) categorization of each item into a content category and subcategory; 3) categorization of each item into a knowledge type category; 4) a rating of the item as STS or not; and 5) a rating of the appropriateness of the item for middle school teachers.

Reviewers for each content assessment included scientists, science educators, and science teachers. Each item was reviewed by 27-31 reviewers in life science, 29-33 reviewers in physical science, and 20-22 reviewers in earth science. Each person reviewed about 75 items.

Data from the reviewers were analyzed to identify items that met the criteria the DTAMS staff established for measuring the assigned constructs:

  • Content category: at least 75% of reviewers identified the item as assessing the given category and, to guarantee a balanced distribution within each category, more than 50% of reviewers agreed on the subcategory; both conditions were required for an item to be accepted on this criterion.
  • Knowledge type: more than 50% of reviewers rated the item as belonging to a single knowledge type.
  • Appropriateness: the item received an average rating above 2.4 on a 3-point scale (1 = low, 2 = medium, 3 = high).

Items that met all three criteria were accepted for inclusion in the field tests. Items that met two of the criteria were reviewed to determine whether the wording could be clarified or improved; revised items were (or will be) sent out for a second review. The items that met the review criteria served as prototypes for items in the field tests.
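The acceptance rules described above can be sketched as a small decision function. This is an illustrative reconstruction only (DTAMS staff did not publish code, and the parameter names here are invented); it simply encodes the 75%/50% agreement thresholds and the 2.4 appropriateness cutoff as stated.

```python
def review_item(content_agree, subcat_agree, ktype_agree, appropriateness_avg):
    """Hypothetical sketch of the DTAMS item-review decision.

    content_agree:       fraction of reviewers agreeing on the content category
    subcat_agree:        fraction agreeing on the subcategory within that category
    ktype_agree:         fraction agreeing on a single knowledge type
    appropriateness_avg: mean rating on a 1 (low) to 3 (high) scale
    """
    criteria_met = sum([
        # Content criterion: both the category and subcategory thresholds
        content_agree >= 0.75 and subcat_agree > 0.50,
        # Knowledge-type criterion: a majority agrees on one type
        ktype_agree > 0.50,
        # Appropriateness criterion: average rating above 2.4
        appropriateness_avg > 2.4,
    ])
    if criteria_met == 3:
        return "accept"   # include in the field tests
    if criteria_met == 2:
        return "revise"   # reword and send out for a second review
    return "reject"


# Example: strong agreement on all three criteria -> accepted
print(review_item(0.80, 0.60, 0.70, 2.6))  # accept
```

The function returns one of three outcomes, mirroring the three-way disposition (accept, revise and re-review, reject) described in the text.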

Using the Assessments

Currently these assessments are available for use free of charge; however, they will be scored by CRMSTD staff for a fee of $10 per teacher per assessment. Once scored, CRMSTD staff will send instructors and professional development providers a detailed summary of teachers' performance that includes scores on individual items, on each science subdomain in the content area, and on four knowledge types (memorized, conceptual understanding, higher-order thinking, pedagogical content knowledge). This summary can be used to analyze performance on specific items, subdomain topics, or knowledge types.

Ordering the Assessments

Send an email to CRMSTD staff indicating your interest, with a brief description of your intended use (e.g., with a Math-Science Partnership grant, for a research study, for other professional development purposes). Also include the following information to help us plan and schedule our scorers:

  • content area(s) you wish to use (Physical, Life, Earth/Space)
  • approximate dates of administration
  • approximate number of teachers completing assessments
  • contact information (including an email address) for the person to whom the completed scoring summaries and fee invoices should be returned

Frequently Asked Questions (FAQ)

  1. Is training required to administer the measurement tool?
    Training is not required. We provide a short document with straightforward administration instructions. Guidelines for use are also under development to help ensure test security.
  2. What are the costs involved?
    Costs are $10 per assessment per teacher. When the electronic scoring summary is sent, an invoice for that amount will be sent as well.
  3. How long does it take teachers to complete one assessment?
    Completion time generally ranges from 30 to 50 minutes, with most participants finishing in under 40 minutes. On the post-test, some teachers write more in response to the open-response questions and may take about 10 minutes longer.
  4. How are the assessments delivered to us, and what is the process to have them scored by you?
    The process for obtaining and scoring the assessments is as follows:
  • Our assessment coordinator sends the assessment(s) electronically via email to the administrator or coordinator ordering them.
  • The administrator or coordinator downloads the assessment(s) and makes as many copies as necessary for each teacher.
  • After administration, the administrator or coordinator mails the completed paper copies back to CRMSTD for scoring at the address identified in Ordering the Assessments, including the email address to which the scoring summary and invoice should be sent.
  • The score summary is sent back electronically along with the fee invoice.
  5. What are your recommendations for using these assessments?
    Our position is to leave it up to the clients to decide how these data will best serve them. Below are some examples of what others have done or are doing with these assessments:
    • We do not provide national norms for the assessment scores because the samples of teachers taking the assessment may or may not be representative of teachers as a whole. The assessments are intended for diagnostic purposes, and we suggest they are best used to measure growth or to identify strengths and weaknesses of individual teachers rather than comparing results to established benchmark scores.
    • Some project directors have administered all three content areas as a pre-test in order to use the results to determine which content area to focus their upcoming professional development on. CAUTION: Due to test fatigue, we recommend not administering all three of the assessments on the same day. In this case, we request that post-tests on all content areas also be administered, on a schedule convenient to you, so that we can collect parallel-form reliability data on the instruments.
    • Some project directors have chosen to focus primarily on one or more of the knowledge type subscores or content subdomain scores. For example, some users were more interested in the pedagogical content knowledge of teachers; others were interested in enhancing science inquiry skills; and still others wanted to focus on deep, schematic knowledge. Some have chosen to emphasize one or two content category subdomains, e.g. "force and motion" for physical science. We ask that complete assessments be administered to maintain the integrity of the assessments, but clients are free to use the various subscores returned as part of the score report in any way that is helpful to them. CAUTION: Since each of these subscores is based on fewer items than the overall assessment, conclusions drawn from subscores alone are more tentative than those drawn from total scores and should be made cautiously.
    • Some project directors have used these assessments in a pre-post design to look for gains, so that it is not necessary to have other norms. Some looked at gains in subscores (either knowledge type or content subcategory) as well as overall gains. The same caution applies as above.
If you have other questions about these assessments, please contact Dr. Thomas Tretter at 502-852-0595 or tom.tretter@louisville.edu, or Dr. Sherri Brown at 502-852-0599 or s.brown@louisville.edu.