Compliance Apps, Modules, Questions, Tasks, and Answers¶

GovReady-Q is a governance, risk, and compliance platform for creating automated compliance processes ranging from gathering information from persons and computers to generating compliance artifacts.

Information gathering is at the heart of GovReady-Q’s guidedmodules data model, which handles compliance apps, modules, questions, tasks, and answers. Eight database tables make up the guidedmodules data model. The complete data model spans three categories:

Compliance apps, which are reusable packages of questions, business logic, and document templates, are defined by the database tables AppSource, AppVersion, Module, ModuleQuestion, and ModuleAsset. Many compliance apps, and different versions of the same compliance app, can be in use simultaneously.
Information submitted by end-users to answer compliance app questions, as well as information submitted through the GovReady-Q API, is stored in the database tables Task, TaskAnswer, and TaskAnswerHistory.
Imputed answers and documents generated by the business logic stored in compliance apps, which are computed on-the-fly.

The tables are described is additional detail below and their relationships are summarized in the following diagram:

Guidedmodules data model (not all tables represented)¶

Note

The image above was generated with:

python3 manage.py graph_models guidedmodules --disable-sort-fields  -X InstrumentationEvent > diagram.dot
dot diagram.dot -Tsvg > diagram.svg

Compliance Apps (Questions, Business Logic, and Templates)¶

The smallest unit of a compliance app is a Question. Questions come in different types, such as text, number, and date (full list). Questions are grouped into questionnaires called Modules. And Modules are grouped into AppVersion, which are the versions of a compliance app that are loaded into the GovReady-Q database. AppVersions are loaded from AppSources, which define how to load compliance apps from remote sources such as GitHub or an on-premises enterprise source control system.

AppVersion, Modules, and Questions define the structure of a compliance app but do not store any user-submitted content. Separating structure from content is a common pattern in application design and is motivated by several GovReady-Q goals:

Compliance apps are reusable and can be easily loaded into different installs of GovReady-Q at different organizations.
Anyone can author compliance apps and they can be kept private or shared publicly.
One type of Question allows the user to choose a complete set of answers to another Module, which allows question answers (i.e. user data) to be accessed from the business logic and templates of not only the Modules the Questions are defined in but also from other Modules and even other compliance apps.
Compliance apps are versioned and apps that have been started by users can be updated in non-destructive ways, preserving answered questionnaires over the course of years,

GovReady-Q administrators will configure GovReady-Q with an AppSource record for each source of compliance apps that will be made available to GovReady-Q users. Each AppSource has a “slug,” which is its unique name on the GovReady-Q installation. An AppSource also defines a remote location — such as a GitHub repository, local directory, or on-premise git server — to query for compliance apps, which will be listed in the app catalog. A typical GovReady-Q installation might have one AppSource providing compliance apps published by GovReady PBC and a second AppSource for providing compliance apps defined by the organization. Each AppSource defines the remote location as well as local permissions such as which compliance apps in the store to make available to GovReady-Q users and Organizations. See App Sources for more information.

Each time a compliance app is started by a user, an AppSource is queried for the latest version of the selected compliance app and an AppVersion is created that holds the complete set of questions, business logic, and templates defined in that version of the compliance app. Therefore there will be a separate AppVersion in the database for every version of every compliance app being used by GovReady-Q users.

Each AppVersion links back to the AppSource it was created from (the “source” field) and holds the name it was given (the “appname” field). Each AppVersion brings along with it the Modules and Questions defined in that version of the compliance app. In other words, Modules and Questions are specific to a AppVersion. Therefore if two versions of a compliance app are present in the database, a question that exists in both versions of the app is represented as two Question records — one for each version of the app. Similarly, there will be (at least) two Modules, one for each version of the app.

All compliance apps have at least two Modules. The first module, whose “module_name” identifier is always app, defines the layout of the starting page of a compliance app, which can list one or more Modules to complete:

compliance app sample 1¶

or a collection of modules and slots to start other compliance apps:

compliance app sample 2¶

In both cases, the “cards” that represent modules to answer or slots for compliance apps to start are defined by Questions in the “app” Module. When a user starts a module or selects a compliance app, that is recorded in the database as answering the respective Question in the “app” Module (more on that below).

Besides listing questions, Modules also define zero or more output documents. Each output document is generated by combining a template stored in the Module with user answers.

Similar to Modules, ModuleQuestions have a “key” field that uniquely identifies them within the Module they are defined in. ModuleQuestions store the question type (text, date, etc.), the prompt shown to users, impute conditions (see below), and other metadata.

ModuleAssets store a compliance app’s static assets used by the app’s templates. These assets often appear as images or other embedded media in output documents generated by the compliance app.

User Answers (Tasks and Answers)¶

When a user is completing the questions in a compliance app, their answers are stored in a separate set of database tables distinct from the tables used to store compliance app questions, business logic, and templates. The tables that hold answers are Task, TaskAnwser, and TaskAnswerHistory.

Task and TaskAnswer are parallel tables to Module and ModuleQuestion and are related to where user answers are stored. A Task is the instantiation of a Module that a GovReady-Q user or set of users are completing. A TaskAnswer is the instantiation of a ModuleQuestion that a GovReady-Q user has answered. All of the Tasks instantiated together for the same compliance app are related through the “project” field.

TaskAnswerHistory stores the complete history of user answers related to a TaskAnswer, i.e. to an instantiated question. The current answer to a question and its associated metadata are stored in the most recent TaskAnswerHistory record for a particular TaskAnswer (the one with the highest “id” value — “id”s are assigned to answers in strictly increasing order). Therefore only the most recent TaskAnswerHistory record for a TaskAnswer holds a current answer, and earlier TaskAnswerHistory records are for audit logging and tracking changes.

TaskAnswerHistory records have a “stored_value” field which holds the user’s answer encoded in JSON, other metadata such as “answered_by” for which user provided the answer, “skipped_reason” and “unsure” which are flags set if the user skipped the question or wants to return to it later, and “reviewed” which holds workflow review state (e.g. if a reviewer marks the answer as approved).

This data model supports GovReady-Q design goals, such as:

Compliance app modules and questions can be assigned to different users to answer.
The answer to questions may change while a complete history of answers are preserved in an immutable record, including preserving past answer metadata such as who answered the question and whether the answer was approved by a reviewer.
Answers are strongly typed: text, numbers, dates, choices, and so on are encoded in a JSON representation that preserves their data type.
All questions can be skipped by storing null in “stored_value.”

Imputed Answers and Output Documents¶

Compliance apps hold business logic and templates that are used to “impute” answers to questions and generate output documents, respectively. These computational outputs are not stored in the database. Instead, they are computed on-the-fly by GovReady-Q as they are needed, and the results of the computations are cached so long as they remain valid.

Imputed Answers¶

Imputation uses business logic rules to infer the answer to questions based on previous answers to questions. Imputation is used for a variety of purposes, such as:

Hiding questions that are not applicable based on the answers to previous questions, by imputing null as the answer to the question.
Pre-answering questions when the answer is known based on the answers to previous questions.
Running business logic computations, such as computing a grade or gap analysis, and storing the result of the computation as the answer to the imputed question.

Questions whose answers are imputed are not asked of the user — the user may never see these questions at all. Some questions are designed to always be imputed to support the execution of business logic rules.

The results of imputation are not stored in the database because they are computed on-the-fly to ensure that the GovReady-Q always runs the business logic rules on the most recent, current set of answers to the questions. As a result, there may be no TaskAnswer or TaskAnswerHistory records for questions that have been imputed.

In certain circumstances, a question’s answer may be imputed after a user already provided an answer to the question. In such cases, the user’s answer remains in the database and appears in the database as the current answer to the question. However, when visiting GovReady-Q, imputed answers supersede user answers and only the imputed value will be used.

Output Documents¶

Compliance apps produce output documents. Each output document is generated by combining a template with the answers to questions (both user-inputted and imputed). Templates are typically written using Markdown syntax and are displayed in GovReady-Q as HTML documents, but they typically can also be downloaded in other formats such as a Microsoft Word document or PDF.

As with imputation, output documents are generated on-the-fly when they are viewed by GovReady-Q users. The generated documents are not stored in the database because they are computed on-the-fly to ensure that the GovReady-Q always runs the template on the most recent, current set of answers to the Module’s questions. As a result, there is no database table for output documents.

Output Document Special Values¶

As of version 0.9.1.22.3, GovReady-Q makes the following special objects available to Output Documents.

{{ control_catalog }}

The control_catalog object provides access to a Catalog of Controls such as the NIST 800-53 and NIST 800-171. The control_catalog object is a flattened, simplified dictionary of the controls sourced from the OSCAL version of the control id as the key (e.g., “AC-2 (2)” is ac.2.2).

When you pass a list of control ids into an Output Document, you can use the control ids to access the text of the controls from the control_catalog object.

The object has the following attributes:

id - the OSCAL-formatted control ID

id_display - the commonly used control ID format

title - title of the control

family_id - the ID of the control group or family to which the control belongs

family_title - The title of the control group or family to which the control belongs

class - the class of the control

description - the combined description of the guidance of the control in markdown format

guidance - the discussion of guidance for the control in markdown format

catalog_file - the name of the catalog file used to generate the control catalog

catalog_id - the UUID of the control catalog

The below example demonstrates using the control_catalog object to generate a heading for a control family’s ID and Title. Notice how the control value, passed in from the view method, is set to lowercase to conform to the OSCAL ID format.

{{control_catalog[control.lower()]['family_id']|upper}} - {{control_catalog[control.lower()]['family_title']}}

The data for Control Catalogs can be found in the controls/catalogs directory.

{{ system }}

The system object provides access to additional information stored in the database about the Information System associated with the project (if one exists). Exactly one Information System is directly associated with a project.

The system object provides access to the component-to-control implementation statements that we began storing as distinct database objects in version 0.9.1.5 of GovReady-Q. We decided to additionally represent the control implementations statements in the database to help teams author control implementation statements by individual system components to support compliance as code.

By using the control_catalog and the system object it is possible in documents to express both authoritative control descriptions and the control implementation statements of individual systems.

As GovReady-Q builds out more functionality, the system object will be the gateway for accessing that information in the Output Documents.

The object has the following attributes:

root_element - the actual IT assess (element) that is the Information System

fisma_id - The FISMA ID of the system

control_implementation_as_dict - a dictionary of objects keyed to an OSCAL-formatted control ID where each object contains an array for control implementation statements for that control, an array of common control statements for that control, and a combined statement. The object structure is:

{
  "au-2": {
            "control_impl_smts": [smt_obj_1, smt_obj_2],
            "common_controls": [common_control_obj_1, common_control_obj_2],
            "combined_smt": "Very long text combining statements into a single string..."
          },
  "au-3": {
            "control_impl_smts": [smt_obj_3, smt_obj_4, ...],
            "common_controls": [],
            "combined_smt": "Very long text combining statements into a single string..."
          },
  ...
}

For more information examine the controls.models.System source code.

Database Query Examples¶

Example: Find all approved answers to a particular question across users and tasks¶

Scenario: Unix File Server App contains a text-type question named “Hostname”. Many users have finished answering all of the questions in the app. However, our reviewers have only approved some of the answers so far. I want to write an SQL query to return all approved answers to the “Hostname” question.

In this section, we will build up an SQL query to extract the data identified in the scenario. The query will be built progressively over the next several sections to explain the rationale behind the GovReady-Q data model. Some of GovReady-Q’s design choices — including separating the definitions of compliance apps from user-submitted data, as well as recording an immutable history of user answers — are reflected in the SQL queries below. The complete SQL query is shown at the end.

You may prefer to use the GovReady-Q API instead of writing a low-level database query, but this example is illustrative for understanding GovReady-Q’s data model no matter which method you use to query the data.

Find the AppVersions¶

First locate the AppSource “slug” and AppVersion “appname” that identifies a compliance app in GovReady-Q’s database. Find the app in the compliance apps catalog and click its Info button:

compliance app¶

The slug and the appname of the compliance app can be found in the URL:

http://mygovreadyq/store/myapps/unix_file_server

In this case the slug is “myapps” and the appname is “unix_file_server”. These two fields identify the compliance app across its versions.

Construct an SQL query to return the numeric IDs of the AppVersions in the database for this compliance app. Each AppVersion may be a different version of the compliance app or a different instance of the same app in use by different users.

SELECT guidedmodules_appversion.id
FROM guidedmodules_appversion
LEFT JOIN guidedmodules_appsource
       ON guidedmodules_appsource.id = guidedmodules_appversion.source_id
WHERE guidedmodules_appsource.slug = "myapps"
  AND guidedmodules_appversion.appname = "unix_file_server";

This query will be adapted in the next section to find the hostname question.

Find the ModuleQuestions¶

Consult the compliance app source code YAML files to determine the “module_name” of the Module and “key” of the ModuleQuestion — which are in the “id” fields in the YAML file.

Module YAML¶

The Module containing the hostname question has “file_server” as its module_name, and the ModuleQuestion’s key is “hostname.”

Construct a preliminary SQL query to find all of the ModuleQuestion records for this question:

SELECT guidedmodules_modulequestion.id
FROM guidedmodules_modulequestion
LEFT JOIN guidedmodules_module
       ON guidedmodules_module.id = guidedmodules_modulequestion.module_id
WHERE guidedmodules_module.module_name = "file_server"
  AND guidedmodules_modulequestion.key = "hostname";

This query might be too broad — it does not restrict the questions to those defined in the Unix File Server compliance app. There might be other compliance apps that use the same module_name and question key. Combine the first two queries to ensure only questions in the Unix File Server app are returned using a LEFT JOIN to bridge the tables:

SELECT guidedmodules_modulequestion.id
FROM guidedmodules_modulequestion
LEFT JOIN guidedmodules_appsource
       ON guidedmodules_appsource.id = guidedmodules_appversion.source_id
LEFT JOIN guidedmodules_appversion
       ON guidedmodules_appversion.id = guidedmodules_module.app_id
LEFT JOIN guidedmodules_module
       ON guidedmodules_module.id = guidedmodules_modulequestion.module_id
WHERE guidedmodules_appsource.slug = "myapps"
  AND guidedmodules_appversion.appname = "unix_file_server"
  AND guidedmodules_module.module_name = "file_server"
  AND guidedmodules_modulequestion.key = "hostname";

We’ll call this query MODULE_QUESTIONS — we’ll use it as a sub-query in the next step.

Find the history of answers¶

GovReady-Q has been designed so that separate tables contain the definition of the question and the user-submitted answers to the question. Each answer is connected to a ModuleQuestion through a TaskAnswer. Locate the TaskAnswers for the questions:

SELECT guidedmodules_taskanswer.id
FROM guidedmodules_taskanswer
WHERE guidedmodules_taskanswer.question_id IN (MODULE_QUESTIONS);

Replace MODULE_QUESTIONS with the preceding SQL query, inserting it as a sub-query.

The TaskAnswer table does not hold user answers, however. Answers are stored in the TaskAnswerHistory table where the complete history of answers to questions are stored. We’ll now adapt the query to fetch the history of answers to this question, including some metadata about the answers, by using a LEFT JOIN to bridge the TaskAnswerHistory table and the TaskAnswer table:

SELECT guidedmodules_taskanswer.id, answer.stored_value, answer.created, siteapp_user.username
FROM guidedmodules_taskanswerhistory AS answer
LEFT JOIN guidedmodules_taskanswer
       ON guidedmodules_taskanswer.id = answer.taskanswer_id
LEFT JOIN siteapp_user
       ON siteapp_user.id = answer.answered_by_id
WHERE guidedmodules_taskanswer.question_id IN (MODULE_QUESTIONS);

Here is an example result:

Task Answer	Stored Value	Created	Username
10	“server1.company.com”	2018-05-19 20:33	user1
10	“server2.company.com”	2018-05-20 10:15	user1
10	null	2018-05-20 10:35	user1
11	“server2.company.com”	2018-05-19 16:20	user2

This is the complete history of answers for the “hostname” question in two separate Tasks, i.e. two instantiations of the compliance app started by different users. The two instantiations of the question are identified by their TaskAnswer “id”s, 10 and 11.

The history for TaskAnswer 10 has three rows. Two rows – the first two — reflect old answers to questions. This indicates the user returned to the question twice. On the first occasion, the user replaced the original answer with "server2.company.com". On the second revisit, the user replaced the original answer with null, clearing the answer because the user decided they didn’t know the answer or the question didn’t apply to them.

The second TaskAnswer was answered once.

We’ll adapt this query in the next step to fetch just the current (most recent) answer in each Task.

Find the current answer to each question¶

The current answer for each question is stored in the TaskAnswerHistory record with the highest “id” for each TaskAnswer. The IDs in the TaskAnswerHistory table are assigned strictly sequentially. To determine which TaskAnswerHistory record holds the current answer, use GROUP BY and max to fetch one TaskAnswerHistory for each TaskAnswer:

SELECT max(answer.id)
FROM guidedmodules_taskanswerhistory AS answer
LEFT JOIN guidedmodules_taskanswer
       ON guidedmodules_taskanswer.id = answer.taskanswer_id
LEFT JOIN siteapp_user
       ON siteapp_user.id = answer.answered_by_id
WHERE guidedmodules_taskanswer.question_id IN (MODULE_QUESTIONS)
GROUP BY guidedmodules_taskanswer.id;

id
103
104

This result holds the current answers to the hostname question. We’ll call this query CURRENT_ANSWERS — we’ll use it as a sub-query in the next query.

To fetch the answers and metadata but for the current answers, we’ll query the TaskAnswerHistory table using the CURRENT_ANSWERS query as a sub-query to identify just the rows that are current answers to questions:

SELECT taskanswer_id, stored_value, created, username, reviewed
FROM guidedmodules_taskanswerhistory
LEFT JOIN siteapp_user
       ON siteapp_user.id = answered_by_id
WHERE guidedmodules_taskanswerhistory.id IN (CURRENT_ANSWERS);

Here is an example result:

Task Answer	Stored Value	Created	Username	Reviewed
10	null	2018-05-20 10:35	user1	0
11	“server2.company.com”	2018-05-19 16:20	user2	2

This result holds the current answers to the Unix File Server hostname question across all instances of the compliance app in the GovReady-Q installation. Notice that the rows in the previous table that represented replaced answers to the first TaskAnswer are omitted from the results in this query and only the current answer for each Task is included.

The “stored_value” column holds the user’s answer encoded in JSON. In JSON, text (strings) are enclosed in double quotes. Therefore we know that the second answer is text. In JSON, null (without double quotes around it) represents an empty value — in GovReady-Q, that means the user skipped the question choosing I Don’t Know, It Doesn’t Apply, or I’ll Come Back.

We’ll modify this query in the next section to filter on the reviewed status of each answer.

Filter on approved answers¶

The “reviewed” field of TaskAnswerHistory stores GovReady-Q’s simple workflow status of the answer. The values are 0 (not reviewed), 1 (reviewed), and 2 (approved). To select just approved answers, add a WHERE clause to the previous SQL query:

SELECT taskanswer_id, stored_value, created, username, reviewed
FROM guidedmodules_taskanswerhistory
LEFT JOIN siteapp_user
       ON siteapp_user.id = answered_by_id
WHERE guidedmodules_taskanswerhistory.id IN (CURRENT_ANSWERS)
  AND reviewed = 2;

The query extracts the answers in a structure similar to the following table:

Task Answer	Stored Value	Created	Username	Reviewed
11	“server2.company.com”	2018-05-19 16:20	user2	2

This is the complete query to extract the approved answers to the hostname question in the Unix File Server compliance app. The query has been simplified by replacing a sub-query with CURRENT_ANSWERS, which itself has a sub-query that has been replaced by MODULE_QUESTIONS. Both sub-queries can be found above.