Instructor's Guide

As a prerequisite, you should read the INGinious Teacher's documentation for the basics of task creation and API usage in the run scripts.

The following is my approach to using the grader; it is intended as a tutorial, not a prescriptive set of rules. I make the following assumptions:

  • The code is in C++
  • You use CMake

Neither of these is an inherent limitation of the grader. The container can include any compiler, interpreter, or other software you like. This is just the default in the containers I have created for 2574/3514 and 3574.

Basic Usage

General task recommendations:

  1. Rate-limit the number of submissions to prevent students from using the grader as their compiler and to reduce the grader load. I have settled on 4 submissions per hour.
  2. If you are using git to fetch the student code, be sure to check the box to allow internet access in the container.

Writing Grading Scripts

Writing good grading scripts is difficult and time-consuming because it is hard to anticipate all the ways students can misinterpret instructions and specifications. The best approach I have found is to write a solution yourself, with unit/integration/functional tests that cover all your code, and then adapt those tests for grading.

Testing

To measure student code correctness I recommend using a simple testing framework, either doctest or Catch2. You can provide the tests, or hide them and have students write their own, as appropriate for the pedagogical goal of the assignment and the level of the course.

  1. Compile your tests (e.g., instructor_tests.cpp) against their code by appending a target to their CMakeLists.txt file. Note that you need to detect when their code will not compile with your tests and provide appropriate feedback. See the examples below.
  2. Run the tests with a reporter (XML) to dump the output.
  3. Parse the reporter output to determine the number of failing tests and the names/documentation of the failing test cases. This is the most important part of giving good feedback. Note that you also need to detect when their code crashes under your tests and provide appropriate feedback (although this can be hard to do unless you run one test at a time).
  4. Assign a grade for this part using the fraction of passing tests or some other rubric (a sketch of the full workflow follows this list).
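
Here is a minimal Python sketch of this workflow, assuming a doctest test binary. The file names, target name, and XML element names are assumptions; inspect a sample report from your doctest version before relying on the schema.

import subprocess
import xml.etree.ElementTree as ET

# 1. Append an instructor test target to the student's CMakeLists.txt.
#    (File and target names are placeholders.)
with open("student/CMakeLists.txt", "a") as f:
    f.write("\nadd_executable(instructor_tests instructor_tests.cpp)\n")

# Configure and build; a nonzero return code means their code does not
# compile with the instructor tests, which needs its own feedback.
subprocess.run(["cmake", "-S", "student", "-B", "build"], check=True)
build = subprocess.run(["cmake", "--build", "build"],
                       capture_output=True, text=True)
if build.returncode != 0:
    print("Your code did not compile with the instructor tests:")
    print(build.stderr)
    raise SystemExit(1)

# 2. Run the tests with doctest's XML reporter. A crash here can leave
#    results.xml missing or truncated, which also needs feedback.
subprocess.run(["build/instructor_tests", "--reporters=xml",
                "--out=results.xml"])

# 3. Count passing test cases and report the names of failing ones.
root = ET.parse("results.xml").getroot()
total = passed = 0
for tc in root.iter("TestCase"):
    total += 1
    results = tc.find("OverallResultsAsserts")
    if results is not None and results.get("failures") == "0":
        passed += 1
    else:
        print(f"Failing test case: {tc.get('name')}")

# 4. Grade as the fraction of passing tests.
grade = 100.0 * passed / total if total else 0.0
print(f"{passed}/{total} tests passed; grade {grade:.1f}")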

For CLI and TUI programs you can drive tests using Tcl Expect or the Python pexpect library. For testing GUI programs I suggest using Qt and the QtTest framework. Examples are provided below.
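
For example, a pexpect-based check of an interactive program might look like this sketch; the executable name, prompt text, and expected output are placeholders:

import pexpect

# Spawn the student's program and hold a conversation with it.
child = pexpect.spawn("student/student_exe", timeout=5, encoding="utf-8")
try:
    child.expect("Enter a number: ")   # assumed prompt
    child.sendline("42")
    child.expect("You entered 42")     # assumed expected output
    print("interactive test passed")
except (pexpect.TIMEOUT, pexpect.EOF):
    print("interactive test failed: expected prompt/output not seen")
child.close()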

Static analyzers

To measure student code quality you can use static analyzers. I recommend the following:

  • Compile with at least -Wall -Wextra -Wshadow -Wconversion -Wpedantic and count the number of warnings (see the sketch after this list). Adding -Werror is an easy way to check that there are no warnings, since any warning stops the compile. You can dump the compiler output into the feedback to give hints to the students. Providing a reference environment (VM or Docker) can help students verify their code works before submitting to the grader. You can also set Visual Studio to compile with /W4 /WX (in the CMakeLists.txt) to get a similar analysis.
  • Optionally use clang's static analyzer; it sometimes catches additional issues.
  • Optionally use cppcheck, but it can be overwhelming unless used from the start of a project.
  • Use clang-tidy with the modernize-* checks to have students write idiomatic modern C++.
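
A sketch of the warning count in Python; the source file name is a placeholder:

import subprocess

WARN_FLAGS = ["-Wall", "-Wextra", "-Wshadow", "-Wconversion", "-Wpedantic"]

# Compile without linking and capture the diagnostics from stderr.
result = subprocess.run(
    ["c++", *WARN_FLAGS, "-c", "student/student_code.cpp", "-o", "/dev/null"],
    capture_output=True, text=True)

# Each GCC/Clang diagnostic line contains the token "warning:".
num_warnings = result.stderr.count("warning:")
print(f"{num_warnings} compiler warnings")
print(result.stderr)  # dump the compiler output into the feedback as hints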

Memory checking

To measure memory safety you can use the clang sanitizers (Address, UB, and Memory), valgrind, or Dr. Memory. Although it is slow, I typically run the instructor tests through valgrind, dump the output to XML, and then parse it to count the number of errors. Common ones are leaks, use of uninitialized memory, and use after delete.
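
A minimal sketch of the valgrind step, assuming the instructor test binary built earlier:

import subprocess
import xml.etree.ElementTree as ET

# Run the instructor tests under valgrind with XML output.
subprocess.run(["valgrind", "--leak-check=full", "--xml=yes",
                "--xml-file=valgrind.xml", "build/instructor_tests"])

# Each detected problem is an <error> element whose <kind> child names it,
# e.g. Leak_DefinitelyLost, UninitValue, InvalidRead.
root = ET.parse("valgrind.xml").getroot()
kinds = [err.findtext("kind") for err in root.iter("error")]
print(f"valgrind reported {len(kinds)} errors: {sorted(set(kinds))}")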

For checking threaded code in 3574 I use clang's ThreadSanitizer.

Coverage testing

If you are focusing on student testing, you can (and should) measure the coverage of their test code. The easiest way to do this is to use gcov and parse the output in Python with gcovr. See the example below for details.
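
A sketch of the coverage step, assuming the tests were built with gcc's --coverage flag and have already been run:

import subprocess
import xml.etree.ElementTree as ET

# gcovr aggregates the raw gcov data into a Cobertura XML report.
subprocess.run(["gcovr", "-r", ".", "--xml", "-o", "coverage.xml"])

# The root <coverage> element carries the overall line-rate (0 to 1).
root = ET.parse("coverage.xml").getroot()
line_rate = float(root.get("line-rate"))
print(f"line coverage: {100 * line_rate:.1f}%")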

Getting student grades back into Canvas

We could use LTI, but it is not currently set up. For now, you or a TA must copy the grades into Canvas by hand.

Submissions: zip files versus GitHub

Students can write code directly in the grader task, upload a file, or submit via a git repository. The last is the approach I recommend. Keep a mapping from student grader id to GitHub id and pull the code each time you grade. See the Python helper functions and examples below.
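
A sketch of the fetch step; the id mapping and repository URL scheme are assumptions, and private repositories will additionally need a token or deploy key:

import subprocess

# Hypothetical mapping from grader (INGinious) username to GitHub username;
# in practice, load this from a roster file you maintain.
GITHUB_IDS = {"grader_user1": "gh_user1", "grader_user2": "gh_user2"}

def fetch_student_repo(grader_id: str, assignment: str) -> str:
    """Shallow-clone the student's assignment repo and return its path."""
    gh_user = GITHUB_IDS[grader_id]
    url = f"https://github.com/{gh_user}/{assignment}.git"
    dest = f"student/{assignment}"
    subprocess.run(["git", "clone", "--depth", "1", url, dest], check=True)
    return dest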

Python vs Shell run scripts

The grading script can be written in either shell or Python. I strongly prefer Python unless it is just a simple test. I have several helper functions that I reuse for common tasks.
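
A Python run script can drive the same INGinious CLI commands used in the shell example below (getinput, feedback-result, feedback-msg). A minimal sketch:

import subprocess

def get_input(problem_id: str) -> str:
    """Fetch a subproblem submission via the getinput command."""
    return subprocess.run(["getinput", problem_id],
                          capture_output=True, text=True).stdout

def set_result(result: str, message: str = "") -> None:
    """Report the result, and optionally a message, back to the student."""
    subprocess.run(["feedback-result", result])
    if message:
        subprocess.run(["feedback-msg", "-em", message])

code = get_input("thecode")
# ... compile, test, analyze ...
set_result("success")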

Python helper functions for common tasks

(TODO: link to the GitHub repo, list and document the helper functions, and explain how to reuse them.)

Testing scripts in a local Docker container

When developing grading scripts I recommend working locally until you are ready to do the final check. You can test your scripts in the same Docker container the grader will use, replacing the API calls with local reads and writes.

  1. Install Docker on Linux, or Docker Desktop on Windows or macOS.
  2. docker pull vtece/grader2-3514:latest (or the desired container for your class).
  3. docker run -it -v $PWD:/scratch vtece/grader2-3574 /bin/bash. This mounts the host directory at /scratch inside the running container.

This is how I do all my grader script development and testing. It saves a lot of time.

Giving the students a matching reference environment

You can give students an identical environment (compiler versions, etc) so they can debug their own code locally.

  • Using Vagrant and VirtualBox
  • Using Docker
  • Integration with Visual Studio Code dev containers

Examples

Here are some typical examples with template/starter code you can adapt. See also the examples directory of the repository.

Example 1: grading a snippet of code

This might be appropriate for an exercise in a lower-level course. Suppose you want to see if students can write hello world in C++. You create a grader task in your course with a subproblem with id "thecode", using the code submission type, and some basic instructions.

screenshots

You then write the following run.sh script:

#!/bin/bash

# Fetch the student's answer to the "thecode" subproblem
getinput "thecode" > student/student_code.cpp

# Capture compiler diagnostics; they go to stderr, hence 2>&1
output=$(c++ -o student/student_exe student/student_code.cpp 2>&1)

if [ "$?" -eq 0 ]; then
    output=$(student/student_exe)
    if [ "$output" = "Hello World!" ]; then
        feedback-result success
    else
        feedback-result failed
        feedback-msg -em "Your program displayed: $output"
    fi
else
    feedback-result failed
    feedback-msg -em "Your program failed to compile:\n $output"
fi

Example 2: grading a non-interactive console executable

Example 3: grading a data structure with unit tests, valgrind, and coverage analysis

This is a typical assignment in 2574/3514. Students are given a data structure interface to code to, and are asked to implement it and write unit tests. The grader script compiles their code and tests, runs the tests, runs the tests again under valgrind, and computes test coverage. Then their library code is compiled and run against the instructor's tests to check functionality.

Example 4: grading an interactive console executable (REPLs)

Example 5: grading a Qt GUI-based executable