Definitions of words relating to machine learning, information security and computer science.

The glossary is a work in progress and is currently incomplete.

ctrl + f can be used to find specific words quickly.




See Zero-day.

5G

The fifth generation of wireless communications technologies supporting cellular/mobile data networks. 5G is considered disruptive in combination with the Internet of Things (IoT); increased bandwidth and low latency will allow devices to adopt wireless connections where they were not able to with 4G.


A wireless protocol standard created by IEEE. As new standards are released the alphabetic character(s) at the end change, e.g., 802.11a, 802.11b.




The act of representing essential features without including the background details or explanations. For example, all computer programs ultimately store and manipulate data using 1s and 0s (machine code). This complexity is removed, or abstracted away, by assembly language, which turns a very long string of binary digits into a more human-readable instruction, such as mov (move). Assembly is still complex and unwieldy, so programming languages such as C abstract and build on these instructions. Modern programming languages (e.g. Java, Python, C#) go further still, abstracting away the underlying mechanics with easy-to-use functions and closer-to-English syntax.
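As a rough illustration of the layering described above, the Python sketch below adds two numbers twice: once by manipulating bits directly, roughly as hardware does, and once with the `+` operator that hides those steps. The function name `add_with_bit_operations` is made up for this example.

```python
def add_with_bit_operations(a: int, b: int) -> int:
    """Add two non-negative integers using only bitwise operations."""
    while b:
        carry = (a & b) << 1   # bits that overflow into the next column
        a = a ^ b              # column-wise sum, ignoring the carries
        b = carry              # repeat until no carries remain
    return a


# Low level of abstraction: the caller sees the bit manipulation.
print(add_with_bit_operations(19, 23))  # 42

# High level of abstraction: the same work is hidden behind "+".
print(19 + 23)                          # 42
```

Both calls produce the same result; the second simply abstracts away how it is reached.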


Also called threat actor or malicious actor. An entity that is responsible for an event or incident that impacts, or has the potential to impact, the safety or security of another entity or system. Can be an individual or group.


A threat actor, typically a nation state or state-sponsored group, which gains unauthorized access to a computer network and remains undetected for an extended period. The tools used (often zero-days) and the scale of resources available to the group indicate its "advanced" nature.

Artificial Intelligence (AI)

The theory and development of computer systems able to perform tasks normally requiring human intelligence. The term is widely misused and misunderstood. There are two types of AI, soft AI and hard AI.




The basic unit of computing. A binary digit representing one of two states, 1 or 0.


A unit of information consisting of eight bits.
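To make the relationship between bits and bytes concrete, here is a minimal Python sketch. The helper name `to_bits` is hypothetical; it prints the eight bits that make up a single byte.

```python
def to_bits(value: int) -> str:
    """Return the eight binary digits (bits) that make up one byte."""
    if not 0 <= value <= 255:
        raise ValueError("a single byte holds values from 0 to 255")
    return format(value, "08b")


# The letter "A" is stored as the byte value 65.
byte_value = ord("A")
bits = to_bits(byte_value)

print(bits)       # 01000001
print(len(bits))  # 8 bits in one byte
print(2 ** 8)     # 256 distinct values fit in one byte
```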




Comma-Separated Values (CSV)



Data mining

A technique for discovering patterns in large data sets involving methods at the intersection of machine learning, statistics and database systems.




A piece of software or a sequence of commands that takes advantage of a vulnerability to cause unintended or unanticipated behavior to occur in computer software or hardware.






Gigabyte

A multiple of the unit byte. A gigabyte (GB) is a measure of storage capacity equal to approximately 1,000 megabytes (MB) or a million kilobytes (kB).



Hash function

An algorithm performed on data (such as a file) to produce a fixed-size, effectively unique value called a hash. Hashes can be used to compare files, for example to determine whether a file has the same signature as a known malicious file.
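The file-comparison use case can be sketched in Python with the standard `hashlib` module. This is only an illustration: SHA-256 is one hash function among many, and the `known_bad_hashes` set and the byte strings standing in for file contents are invented for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical set of hashes of files already known to be malicious.
known_bad_hashes = {sha256_hex(b"pretend this is malware")}


def is_known_malicious(data: bytes) -> bool:
    """Check whether the data's hash matches a known malicious hash."""
    return sha256_hex(data) in known_bad_hashes


print(is_known_malicious(b"pretend this is malware"))   # True
print(is_known_malicious(b"pretend this is malware!"))  # False: one extra byte changes the hash
```

Comparing short hashes rather than whole files is what makes this kind of lookup cheap.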


Software that creates and runs virtual machines. May also be used to refer to the machine running the software.



Institute of Electrical and Electronics Engineers (IEEE)

A professional organisation for electrical and electronics engineers. IEEE publishes the largest amount of research literature relating to electronic and electrical engineering and computer science in the world in its journals.

Internet Protocol

The principal communications protocol in the Internet protocol suite for relaying data across network boundaries.

Internet Protocol Suite

The conceptual model and set of communications protocols used in the internet and similar computer networks. Also known as "TCP/IP" after the foundational protocols in the suite: Transmission Control Protocol (TCP) and Internet Protocol (IP).

IP address

An Internet Protocol address (IP address) is a numerical label assigned to each device connected to a computer network that uses the Internet Protocol for communication. IP addresses identify the host or network interface and its location on the network.
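A short Python sketch using the standard `ipaddress` module shows an IP address as both a numerical label and a location on a network. The address and network below come from a range reserved for documentation and are assumptions for illustration only.

```python
import ipaddress

# Example values only: 192.0.2.0/24 is reserved for documentation.
address = ipaddress.ip_address("192.0.2.10")
network = ipaddress.ip_network("192.0.2.0/24")

print(address.version)     # 4 (an IPv4 address)
print(int(address))        # the same address as a single number
print(address in network)  # True: the address is located on this network
```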

Infrastructure-as-a-Service (IaaS)

A form of cloud service providing infrastructure such as computing resources, routing and scaling, among others. Such services allow companies to create networks and gain computing power and entire software platforms at the click of a button, rather than paying for expensive physical infrastructure and the associated design and deployment. IaaS is one of three as-a-service models, along with SaaS and PaaS.

International Organization for Standardization (ISO)

An international standard-setting body composed of representatives from various national standards organizations. ISO is not an acronym; it is derived from the Greek isos, meaning equal, to give a unified short name across languages. For an example ISO standard, see ISO 27001.


A global system of interconnected computer systems. Distinct from the World Wide Web.

Internet of Things (IoT)

The collection of computing devices embedded in everyday objects connected over the internet, enabling them to send and receive data. For example, fridges, wearables and sensors. Typically, this means any device that would not normally be able to send and receive data via the internet and whose connection adds new functionality.





Kernel

The software at the core of a computer's operating system which controls everything. Kernel code is loaded into a separate area of memory, which is protected from access by application programs or other, less critical parts of the operating system. The kernel performs tasks such as running processes and managing hardware devices in this protected kernel space. In contrast, application programs like browsers and word processors use a separate area of memory called user space.

Kilobyte

A multiple of the unit byte. A kilobyte (kB) is a measure of storage capacity equal to approximately 1,000 bytes.



Local Area Network (LAN)

Low-level

Refers to a low level of abstraction. For example, a "low-level programming language" is one that interacts directly with hardware (e.g. C) or runs directly on the hardware (e.g. assembly or machine code). See abstraction for more.

Linux

A family of Unix-like operating systems based on the open-source Linux kernel, first released in 1991 by Linus Torvalds. Popular Linux versions (also called distributions) include Ubuntu, Arch Linux and Fedora, but there are hundreds of others. Some are designed for specific purposes, such as scientific analysis, digital forensics or penetration testing.



Machine learning

A method of data analysis that automates analytical model building, based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention. It is a subset of artificial intelligence.
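As a deliberately tiny illustration of "learning from data", the Python sketch below labels a new observation by copying the label of the most similar example it has already seen (a nearest-neighbour approach, one simple technique among many). The feature values and labels are made up.

```python
def nearest_neighbour(training_data, point):
    """Predict a label by copying the label of the closest training example."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    closest = min(training_data, key=lambda example: squared_distance(example[0], point))
    return closest[1]


# Invented examples: (kilobytes sent, failed logins) -> label
training_data = [
    ((100, 0), "benign"),
    ((120, 1), "benign"),
    ((900, 30), "suspicious"),
    ((950, 25), "suspicious"),
]

print(nearest_neighbour(training_data, (110, 2)))   # benign
print(nearest_neighbour(training_data, (870, 28)))  # suspicious
```

No rule for "suspicious" is written by hand; the decision comes entirely from the examples supplied.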

Malware

Software used or created by malicious actors to disrupt or damage computer operation, gather sensitive information, or gain access to private computer systems. Short for ‘malicious software’.

Megabyte

A multiple of the unit byte. A megabyte (MB) is a measure of storage capacity equal to approximately 1,000 kilobytes (kB) or a million bytes.





The ability to prove that a specific individual has carried out an activity on a computer or online, so that it cannot later be denied.



One-way function

A function that is easy to compute but hard to invert. For example, a cryptographic hash function is one-way because it is easy (fast) to compute a hash from an input, but difficult (extremely slow / essentially impossible) to recover the input from the hash. This property provides a method to secure data.
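The asymmetry can be sketched in Python. The forward direction is a single call to SHA-256 (chosen here only as a familiar example), while the only general way back is to guess inputs until one matches. The function names and the three-letter message are assumptions made for the example.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def forward(message: str) -> str:
    """Easy direction: one hash computation."""
    return hashlib.sha256(message.encode()).hexdigest()


def brute_force(target_hash: str, length: int):
    """Hard direction: try every lowercase string of the given length."""
    for letters in product(ascii_lowercase, repeat=length):
        guess = "".join(letters)
        if forward(guess) == target_hash:
            return guess
    return None


digest = forward("cab")
print(brute_force(digest, 3))  # cab, found within 26**3 = 17,576 guesses
```

Each extra character multiplies the number of guesses by 26, so inputs of realistic length quickly become impractical to recover this way.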

OSI model

Open Systems Interconnection (OSI) is an international ISO technical standard that defines a seven-layer framework for computer networking, referred to as the OSI model.

Open-source software

Software whose source code is released under a license that grants users the rights to study, change, and distribute the software to anyone and for any purpose. Such software is often developed in a collaborative public manner.

Open-source intelligence (OSINT)

Data collected from publicly available sources to be used in an intelligence context. Sources include, for example, Twitter, Facebook and Instagram.




Petabyte

A multiple of the unit byte. "Peta" indicates multiplication by the fifth power of 1000, or 10^15. A petabyte (PB) is a measure of storage capacity equal to approximately 1,000 terabytes (TB), a million gigabytes (GB) or a billion megabytes (MB).

Platform-as-a-Service (PaaS)

Protocol

See communication protocol.



No "Q" entries at this time.



Repository (Repo)




Social engineering

Soft AI

Standard

See technical standard.


A nation state that finances another entity. This tactic is often used to create deniability and confusion.


A 'fingerprint' representing the characteristics of a virus or malware, or its type. Internet security software uses a database of signatures to detect viruses and malware. Signatures are often stored as hashes.

Software-as-a-Service (SaaS)

A software licensing and delivery model in which software is licensed on a subscription basis and is centrally hosted in the cloud. Examples include Google Apps such as Google Docs and Google Sheets.






See Internet Protocol suite.

Technical standard

An established norm or requirement for a repeatable technical task. Typically written in a formal document that establishes uniform engineering or technical criteria, methods, processes, and practices. Also see ISO.

Terabyte

A multiple of the unit byte. "Tera" indicates multiplication by the fourth power of 1000, or 10^12. A terabyte (TB) is a measure of storage capacity equal to approximately 1,000 gigabytes (GB), a million megabytes (MB) or a billion kilobytes (kB).




UDP

User Datagram Protocol (UDP) is one of the core components of the Internet protocol suite.

Uniform Resource Identifier (URI)

A string of characters that identifies a particular resource by name, by location, or both. Also see URL (Uniform Resource Locator).

Uniform Resource Locator (URL)

A type of Uniform Resource Identifier (URI) that identifies a resource's network location, typically on the internet. URLs only represent a resource's location. For example, is the location of the main page of Wikipedia. DNS (Domain Name System) is used to resolve the host name in a URL to an IP address.
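To show what "location" means in practice, the Python sketch below splits a URL into its parts with the standard `urllib.parse` module. The URL is invented for the example and does not come from this glossary.

```python
from urllib.parse import urlsplit

# A made-up URL used purely for illustration.
url = "https://example.org:8443/docs/glossary?letter=u#url"
parts = urlsplit(url)

print(parts.scheme)    # https          (how to reach the resource)
print(parts.hostname)  # example.org    (the host name that DNS resolves)
print(parts.port)      # 8443
print(parts.path)      # /docs/glossary (where on the host the resource lives)
print(parts.query)     # letter=u
print(parts.fragment)  # url
```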

Unix

A family of operating systems that derive from the original AT&T Unix released in 1971. Unix systems are characterized by a modular design and a design philosophy that differ from those of other operating systems.



Version Control System

Virus

A hidden (the user is unaware of it), self-replicating piece of malware that propagates by infecting another program, i.e., inserting a copy of itself into it and becoming part of it. A virus requires its host program to be executed in order to activate. See worm.


The practice of attempting to obtain personal or financial information via a telephone call in order to commit fraud or identity theft.


Refers to the act of creating a virtual version of something, for example virtual computers, storage devices, and computer network resources.

Virtual Machine (VM)

A virtual computer run using software, i.e., a computer run on another computer. VMs are based on computer architectures (Windows, Linux, etc.) and provide the functionality of a physical computer. This allows functionality beyond what a single machine can achieve.


A flaw or weakness that can be used to attack a system or organization.




Malware which executes independently, propagating onto other host computer systems, and may consume large amounts of resources until a system becomes unresponsive. Also see virus.



No "X" entries at this time.



Yottabyte

A multiple of the unit byte. "Yotta" indicates multiplication by the eighth power of 1000, or 10^24. A yottabyte (YB) is a measure of storage capacity equal to approximately 1,000 zettabytes (ZB), a trillion terabytes (TB) or a million trillion megabytes (MB).
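The byte-unit entries in this glossary (kilobyte through yottabyte) all follow the same powers-of-1000 pattern, which the Python sketch below makes explicit. The `UNITS` table and function names are hypothetical. The exabyte (EB) has no entry in this glossary and is included only to complete the sequence.

```python
# Power of 1000 associated with each decimal (SI) byte unit.
UNITS = {"kB": 1, "MB": 2, "GB": 3, "TB": 4, "PB": 5, "EB": 6, "ZB": 7, "YB": 8}


def to_bytes(amount, unit):
    """Convert an amount in the given unit to a number of bytes."""
    return amount * 1000 ** UNITS[unit]


def convert(amount, from_unit, to_unit):
    """Convert between two byte units, e.g. petabytes to gigabytes."""
    return to_bytes(amount, from_unit) / to_bytes(1, to_unit)


print(to_bytes(1, "PB") == 10 ** 15)  # True: fifth power of 1000
print(convert(1, "PB", "GB"))         # 1000000.0 (a million gigabytes)
print(convert(1, "YB", "ZB"))         # 1000.0
```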



Zero-day

A software vulnerability that is unknown to the software creator and is therefore unlikely to be patched.

