Important Artifacts In Windows-I

This chapter will explain various concepts involved in Microsoft Windows forensics and the important artifacts that an investigator can obtain from the investigation process.

Introduction

Artifacts are the objects or areas within a computer system that have important information related to the activities performed by the computer user. The type and location of this information depends upon the operating system. During forensic analysis, these artifacts play a very important role in approving or disapproving the investigator’s observation.

Importance of Windows Artifacts for Forensics

Windows artifacts assume significance due to the following reasons −

Around 90% of the traffic in world comes from the computers using Windows as their operating system. That is why for digital forensics examiners Windows artifacts are very essentials.
The Windows operating system stores different types of evidences related to the user activity on computer system. This is another reason which shows the importance of Windows artifacts for digital forensics.
Many times the investigator revolves the investigation around old and traditional areas like user crated data. Windows artifacts can lead the investigation towards non-traditional areas like system created data or the artifacts.
Great abundance of artifacts is provided by Windows which are helpful for investigators as well as for companies and individuals performing informal investigations.
Increase in cyber-crime in recent years is another reason that Windows artifacts are important.

Windows Artifacts and their Python Scripts

In this section, we are going to discuss about some Windows artifacts and Python scripts to fetch information from them.

Recycle Bin

It is one of the important Windows artifacts for forensic investigation. Windows recycle bin contains the files that have been deleted by the user, but not physically removed by the system yet. Even if the user completely removes the file from system, it serves as an important source of investigation. This is because the examiner can extract valuable information, like original file path as well as time that it was sent to Recycle Bin, from the deleted files.

Note that the storage of Recycle Bin evidence depends upon the version of Windows. In the following Python script, we are going to deal with Windows 7 where it creates two files: $R file that contains the actual content of the recycled file and $I file that contains original file name, path, file size when file was deleted.

For Python script we need to install third party modules namely pytsk3, pyewf and unicodecsv. We can use pip to install them. We can follow the following steps to extract information from Recycle Bin −

First, we need to use recursive method to scan through the $Recycle.bin folder and select all the files starting with $I.
Next, we will read the contents of the files and parse the available metadata structures.
Now, we will search for the associated $R file.
At last, we will write the results into CSV file for review.

Let us see how to use Python code for this purpose −

First, we need to import the following Python libraries −

from __future__ import print_function
from argparse import ArgumentParser

import datetime
import os
import struct

from utility.pytskutil import TSKUtil
import unicodecsv as csv

Next, we need to provide argument for command-line handler. Note that here it will accept three arguments – first is the path to evidence file, second is the type of evidence file and third is the desired output path to the CSV report, as shown below −

if __name__ == '__main__':
   parser = argparse.ArgumentParser('Recycle Bin evidences')
   parser.add_argument('EVIDENCE_FILE', help = "Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help = "Evidence file format",
   choices = ('ewf', 'raw'))
   parser.add_argument('CSV_REPORT', help = "Path to CSV report")
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.CSV_REPORT)

Now, define the main() function that will handle all the processing. It will search for $I file as follows −

def main(evidence, image_type, report_file):
   tsk_util = TSKUtil(evidence, image_type)
   dollar_i_files = tsk_util.recurse_files("$I", path = '/$Recycle.bin',logic = "startswith")
   
   if dollar_i_files is not None:
      processed_files = process_dollar_i(tsk_util, dollar_i_files)
      write_csv(report_file,['file_path', 'file_size', 'deleted_time','dollar_i_file', 'dollar_r_file', 'is_directory'],processed_files)
   else:
      print("No $I files found")

Now, if we found $I file, then it must be sent to process_dollar_i() function which will accept the tsk_util object as well as the list of $I files, as shown below −

def process_dollar_i(tsk_util, dollar_i_files):
   processed_files = []
   
   for dollar_i in dollar_i_files:
      file_attribs = read_dollar_i(dollar_i[2])
      if file_attribs is None:
         continue
      file_attribs['dollar_i_file'] = os.path.join('/$Recycle.bin', dollar_i[1][1:])

Now, search for $R files as follows −

recycle_file_path = os.path.join('/$Recycle.bin',dollar_i[1].rsplit("/", 1)[0][1:])
dollar_r_files = tsk_util.recurse_files(
   "$R" + dollar_i[0][2:],path = recycle_file_path, logic = "startswith")
   
   if dollar_r_files is None:
      dollar_r_dir = os.path.join(recycle_file_path,"$R" + dollar_i[0][2:])
      dollar_r_dirs = tsk_util.query_directory(dollar_r_dir)
   
   if dollar_r_dirs is None:
      file_attribs['dollar_r_file'] = "Not Found"
      file_attribs['is_directory'] = 'Unknown'
   
   else:
      file_attribs['dollar_r_file'] = dollar_r_dir
      file_attribs['is_directory'] = True
   
   else:
      dollar_r = [os.path.join(recycle_file_path, r[1][1:])for r in dollar_r_files]
      file_attribs['dollar_r_file'] = ";".join(dollar_r)
      file_attribs['is_directory'] = False
      processed_files.append(file_attribs)
   return processed_files

Now, define read_dollar_i() method to read the $I files, in other words, parse the metadata. We will use read_random() method to read the signature’s first eight bytes. This will return none if signature does not match. After that, we will have to read and unpack the values from $I file if that is a valid file.

def read_dollar_i(file_obj):
   if file_obj.read_random(0, 8) != '\x01\x00\x00\x00\x00\x00\x00\x00':
      return None
   raw_file_size = struct.unpack('<q', file_obj.read_random(8, 8))
   raw_deleted_time = struct.unpack('<q',   file_obj.read_random(16, 8))
   raw_file_path = file_obj.read_random(24, 520)

Now, after extracting these files we need to interpret the integers into human-readable values by using sizeof_fmt() function as shown below −

file_size = sizeof_fmt(raw_file_size[0])
deleted_time = parse_windows_filetime(raw_deleted_time[0])

file_path = raw_file_path.decode("utf16").strip("\x00")
return {'file_size': file_size, 'file_path': file_path,'deleted_time': deleted_time}

Now, we need to define sizeof_fmt() function as follows −

def sizeof_fmt(num, suffix = 'B'):
   for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
      if abs(num) < 1024.0:
         return "%3.1f%s%s" % (num, unit, suffix)
      num /= 1024.0
   return "%.1f%s%s" % (num, 'Yi', suffix)

Now, define a function for interpreted integers into formatted date and time as follows −

def parse_windows_filetime(date_value):
   microseconds = float(date_value) / 10
   ts = datetime.datetime(1601, 1, 1) + datetime.timedelta(
      microseconds = microseconds)
   return ts.strftime('%Y-%m-%d %H:%M:%S.%f')

Now, we will define write_csv() method to write the processed results into a CSV file as follows −

def write_csv(outfile, fieldnames, data):
   with open(outfile, 'wb') as open_outfile:
      csvfile = csv.DictWriter(open_outfile, fieldnames)
      csvfile.writeheader()
      csvfile.writerows(data)

When you run the above script, we will get the data from $I and $R file.

Sticky Notes

Windows Sticky Notes replaces the real world habit of writing with pen and paper. These notes used to float on the desktop with different options for colors, fonts etc. In Windows 7 the Sticky Notes file is stored as an OLE file hence in the following Python script we will investigate this OLE file to extract metadata from Sticky Notes.

For this Python script, we need to install third party modules namely olefile, pytsk3, pyewf and unicodecsv. We can use the command pip to install them.

We can follow the steps discussed below for extracting the information from Sticky note file namely StickyNote.sn −

Firstly, open the evidence file and find all the StickyNote.snt files.
Then, parse the metadata and content from the OLE stream and write the RTF content to files.
Lastly, create CSV report of this metadata.

Python Code

Let us see how to use Python code for this purpose −

First, import the following Python libraries −

from __future__ import print_function
from argparse import ArgumentParser

import unicodecsv as csv
import os
import StringIO

from utility.pytskutil import TSKUtil
import olefile

Next, define a global variable which will be used across this script −

REPORT_COLS = ['note_id', 'created', 'modified', 'note_text', 'note_file']

if __name__ == '__main__':
   parser = argparse.ArgumentParser('Evidence from Sticky Notes')
   parser.add_argument('EVIDENCE_FILE', help="Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help="Evidence file format",choices=('ewf', 'raw'))
   parser.add_argument('REPORT_FOLDER', help="Path to report folder")
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE, args.REPORT_FOLDER)

Now, we will define main() function which will be similar to the previous script as shown below −

def main(evidence, image_type, report_folder):
   tsk_util = TSKUtil(evidence, image_type)
   note_files = tsk_util.recurse_files('StickyNotes.snt', '/Users','equals')

Now, let us iterate through the resulting files. Then we will call parse_snt_file() function to process the file and then we will write RTF file with the write_note_rtf() method as follows −

report_details = []
for note_file in note_files:
   user_dir = note_file[1].split("/")[1]
   file_like_obj = create_file_like_obj(note_file[2])
   note_data = parse_snt_file(file_like_obj)
   
   if note_data is None:
      continue
   write_note_rtf(note_data, os.path.join(report_folder, user_dir))
   report_details += prep_note_report(note_data, REPORT_COLS,"/Users" + note_file[1])
   write_csv(os.path.join(report_folder, 'sticky_notes.csv'), REPORT_COLS,report_details)

Next, we need to define various functions used in this script.

First of all we will define create_file_like_obj() function for reading the size of the file by taking pytsk file object. Then we will define parse_snt_file() function that will accept the file-like object as its input and is used to read and interpret the sticky note file.

def parse_snt_file(snt_file):
   
   if not olefile.isOleFile(snt_file):
      print("This is not an OLE file")
      return None
   ole = olefile.OleFileIO(snt_file)
   note = {}
   
   for stream in ole.listdir():
      if stream[0].count("-") == 3:
         if stream[0] not in note:
            note[stream[0]] = {"created": ole.getctime(stream[0]),"modified": ole.getmtime(stream[0])}
         content = None
         if stream[1] == '0':
            content = ole.openstream(stream).read()
         elif stream[1] == '3':
            content = ole.openstream(stream).read().decode("utf-16")
         if content:
            note[stream[0]][stream[1]] = content
	return note

Now, create a RTF file by defining write_note_rtf() function as follows

def write_note_rtf(note_data, report_folder):
   if not os.path.exists(report_folder):
      os.makedirs(report_folder)
   
   for note_id, stream_data in note_data.items():
      fname = os.path.join(report_folder, note_id + ".rtf")
      with open(fname, 'w') as open_file:
         open_file.write(stream_data['0'])

Now, we will translate the nested dictionary into a flat list of dictionaries that are more appropriate for a CSV spreadsheet. It will be done by defining prep_note_report() function. Lastly, we will define write_csv() function.

def prep_note_report(note_data, report_cols, note_file):
   report_details = []
   
   for note_id, stream_data in note_data.items():
      report_details.append({
         "note_id": note_id,
         "created": stream_data['created'],
         "modified": stream_data['modified'],
         "note_text": stream_data['3'].strip("\x00"),
         "note_file": note_file
      })
   return report_details
def write_csv(outfile, fieldnames, data):
   with open(outfile, 'wb') as open_outfile:
      csvfile = csv.DictWriter(open_outfile, fieldnames)
      csvfile.writeheader()
      csvfile.writerows(data)

After running the above script, we will get the metadata from Sticky Notes file.

Registry Files

Windows registry files contain many important details which are like a treasure trove of information for a forensic analyst. It is a hierarchical database that contains details related to operating system configuration, user activity, software installation etc. In the following Python script we are going to access common baseline information from the SYSTEM and SOFTWARE hives.

For this Python script, we need to install third party modules namely pytsk3, pyewf and registry. We can use pip to install them.

We can follow the steps given below for extracting the information from Windows registry −

First, find registry hives to process by its name as well as by path.
Then we to open these files by using StringIO and Registry modules.
At last we need to process each and every hive and print the parsed values to the console for interpretation.

Python Code

Let us see how to use Python code for this purpose −

First, import the following Python libraries −

from __future__ import print_function
from argparse import ArgumentParser

import datetime
import StringIO
import struct

from utility.pytskutil import TSKUtil
from Registry import Registry

Now, provide argument for the command-line handler. Here it will accept two arguments - first is the path to the evidence file, second is the type of evidence file, as shown below −

if __name__ == '__main__':
   parser = argparse.ArgumentParser('Evidence from Windows Registry')
   parser.add_argument('EVIDENCE_FILE', help = "Path to evidence file")
   parser.add_argument('IMAGE_TYPE', help = "Evidence file format",
   choices = ('ewf', 'raw'))
   args = parser.parse_args()
   main(args.EVIDENCE_FILE, args.IMAGE_TYPE)

Now we will define main() function for searching SYSTEM and SOFTWARE hives within /Windows/System32/config folder as follows −

def main(evidence, image_type):
   tsk_util = TSKUtil(evidence, image_type)
   tsk_system_hive = tsk_util.recurse_files('system', '/Windows/system32/config', 'equals')
   tsk_software_hive = tsk_util.recurse_files('software', '/Windows/system32/config', 'equals')
   system_hive = open_file_as_reg(tsk_system_hive[0][2])
   software_hive = open_file_as_reg(tsk_software_hive[0][2])
   process_system_hive(system_hive)
   process_software_hive(software_hive)

Now, define the function for opening the registry file. For this purpose, we need to gather the size of file from pytsk metadata as follows −

def open_file_as_reg(reg_file):
   file_size = reg_file.info.meta.size
   file_content = reg_file.read_random(0, file_size)
   file_like_obj = StringIO.StringIO(file_content)
   return Registry.Registry(file_like_obj)

Now, with the help of following method, we can process SYSTEM> hive −

def process_system_hive(hive):
   root = hive.root()
   current_control_set = root.find_key("Select").value("Current").value()
   control_set = root.find_key("ControlSet{:03d}".format(current_control_set))
   raw_shutdown_time = struct.unpack(
      '<Q', control_set.find_key("Control").find_key("Windows").value("ShutdownTime").value())
   
   shutdown_time = parse_windows_filetime(raw_shutdown_time[0])
   print("Last Shutdown Time: {}".format(shutdown_time))
   
   time_zone = control_set.find_key("Control").find_key("TimeZoneInformation")
      .value("TimeZoneKeyName").value()
   
   print("Machine Time Zone: {}".format(time_zone))
   computer_name = control_set.find_key("Control").find_key("ComputerName").find_key("ComputerName")
      .value("ComputerName").value()
   
   print("Machine Name: {}".format(computer_name))
   last_access = control_set.find_key("Control").find_key("FileSystem")
      .value("NtfsDisableLastAccessUpdate").value()
   last_access = "Disabled" if last_access == 1 else "enabled"
   print("Last Access Updates: {}".format(last_access))

Now, we need to define a function for interpreted integers into formatted date and time as follows −

def parse_windows_filetime(date_value):
   microseconds = float(date_value) / 10
   ts = datetime.datetime(1601, 1, 1) + datetime.timedelta(microseconds = microseconds)
   return ts.strftime('%Y-%m-%d %H:%M:%S.%f')

def parse_unix_epoch(date_value):
   ts = datetime.datetime.fromtimestamp(date_value)
   return ts.strftime('%Y-%m-%d %H:%M:%S.%f')

Now with the help of following method we can process SOFTWARE hive −

def process_software_hive(hive):
   root = hive.root()
   nt_curr_ver = root.find_key("Microsoft").find_key("Windows NT")
      .find_key("CurrentVersion")
   
   print("Product name: {}".format(nt_curr_ver.value("ProductName").value()))
   print("CSD Version: {}".format(nt_curr_ver.value("CSDVersion").value()))
   print("Current Build: {}".format(nt_curr_ver.value("CurrentBuild").value()))
   print("Registered Owner: {}".format(nt_curr_ver.value("RegisteredOwner").value()))
   print("Registered Org: 
      {}".format(nt_curr_ver.value("RegisteredOrganization").value()))
   
   raw_install_date = nt_curr_ver.value("InstallDate").value()
   install_date = parse_unix_epoch(raw_install_date)
   print("Installation Date: {}".format(install_date))

After running the above script, we will get the metadata stored in Windows Registry files.