These tables summarize guidelines for designing prudent malware experiments. For better visibility, we structured these guidelines into four categories: correct datasets, transparency, realism, and safety.

Legend: each guideline is rated as a must, as something that should be done, or as nice to have.

A. Correct Datasets

A. 1) Check if goodware samples should be removed from datasets.
A. 2) Balance datasets over malware families.
A. 3) Check whether training and evaluation datasets should have distinct families.
A. 4) Perform analysis with higher privileges than the malware's.
A. 5) Discuss and, if necessary, mitigate analysis artifacts and biases.
A. 6) Use caution when blending malware activity traces into benign background activity.
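
As an illustration of guidelines A.2 and A.3, the following minimal sketch caps the number of samples per family and then assigns whole families to either the training or the evaluation set. It assumes samples are available as (sample ID, family label) pairs; the function names, the cap, the split fraction, and the placeholder family names are hypothetical and not taken from the guidelines themselves.

```python
import random
from collections import defaultdict


def balance_by_family(samples, cap, seed=0):
    """Keep at most `cap` samples per family by random downsampling."""
    rng = random.Random(seed)
    by_family = defaultdict(list)
    for sample_id, family in samples:
        by_family[family].append(sample_id)
    balanced = []
    for family in sorted(by_family):
        ids = sorted(by_family[family])
        rng.shuffle(ids)
        balanced.extend((sid, family) for sid in ids[:cap])
    return balanced


def split_by_family(samples, eval_fraction=0.5, seed=0):
    """Assign whole families to training or evaluation, never both."""
    rng = random.Random(seed)
    families = sorted({family for _, family in samples})
    rng.shuffle(families)
    n_eval = max(1, int(len(families) * eval_fraction))
    eval_families = set(families[:n_eval])
    train = [s for s in samples if s[1] not in eval_families]
    evaluation = [s for s in samples if s[1] in eval_families]
    return train, evaluation


# Placeholder data: family_a is over-represented before balancing.
samples = [
    ("s1", "family_a"), ("s2", "family_a"), ("s3", "family_a"),
    ("s4", "family_b"),
    ("s5", "family_c"), ("s6", "family_c"),
]
balanced = balance_by_family(samples, cap=2)
train, evaluation = split_by_family(balanced)
```

Whether capping or family-disjoint splitting is appropriate depends on the experiment's goal; the guidelines ask only that the question be checked, not that this particular scheme be used.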

B. Transparency

B. 1) State family names of employed malware samples.
B. 2) List which malware was analyzed when.
B. 3) Explain the malware sample selection.
B. 4) Mention the system used during execution.
B. 5) Describe the network connectivity of the analysis environment.
B. 6) Analyze the reasons for false positives and false negatives.
B. 7) Analyze the nature/diversity of true positives.
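
Guidelines B.6 and B.7 can be supported by a simple per-family breakdown of detection outcomes. The sketch below assumes each evaluated sample is a record with hypothetical `id`, `family`, `malicious`, and `flagged` fields; this record layout is an assumption for illustration only.

```python
from collections import Counter


def summarize_detections(records):
    """Count true positives and false negatives per family; list false positives."""
    true_pos, false_neg = Counter(), Counter()
    false_positives = []
    for r in records:
        if r["malicious"]:
            # Malicious sample: detected (TP) or missed (FN), tracked per family.
            (true_pos if r["flagged"] else false_neg)[r["family"]] += 1
        elif r["flagged"]:
            # Benign sample that was flagged: keep its ID for manual inspection.
            false_positives.append(r["id"])
    return true_pos, false_neg, false_positives


# Placeholder records; family names are illustrative only.
records = [
    {"id": "m1", "family": "family_a", "malicious": True, "flagged": True},
    {"id": "m2", "family": "family_a", "malicious": True, "flagged": True},
    {"id": "m3", "family": "family_b", "malicious": True, "flagged": False},
    {"id": "b1", "family": None, "malicious": False, "flagged": True},
    {"id": "b2", "family": None, "malicious": False, "flagged": False},
]
true_pos, false_neg, false_positives = summarize_detections(records)
```

In this toy input all true positives come from a single family, which is the kind of low diversity that guideline B.7 asks authors to surface, while the listed false positive and the missed family are starting points for the analysis in B.6.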

C. Realism

C. 1) Evaluate relevant malware families.
C. 2) Perform real-world evaluations.
C. 3) Exercise caution when generalizing from a single OS version, such as Windows XP.
C. 4) Choose appropriate malware stimuli.
C. 5) Consider allowing Internet access to malware.

D. Safety

D. 1) Deploy and describe containment policies.