Database with malware hashes

n1d0 · Apr 3, 2024

Hi everyone, I'm working on a cybersecurity project, and part of it involves comparing files to malware samples to determine if the scanned file is malware.

Therefore I need a database that contains malware hashes to perform this step.

If anyone knows of a site that contains something similar to what I require, I would appreciate it.

Trident · Apr 3, 2024

If you just need malware hashes, use an API such as MalwareBazaar or the VirusTotal (VT) API.

MalwareBazaar | API

API documentation

bazaar.abuse.ch

https://docs.virustotal.com/reference/overview

Code:

 tip.kaspersky.com/Help/Doc_data/en-US/ThreatLookupAPI.htm

Some of these APIs are not free: subscriptions are required, because threat intelligence, especially when it is curated and properly checked for FPs, is not free.
Many AV vendors offer APIs as well.
The Bazaar API should be free, but it is plagued with false positives.
You query it with an HTTP POST request like this:

Code:

wget --post-data "query=get_info&hash=7de2c1bf58bce09eecc70476747d88a26163c3d6bb1d85235c24a558d1f16754" https://mb-api.abuse.ch/api/v1/
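For anyone who would rather call the API from a script than from wget, here is a minimal Python sketch of the same POST request, using only the standard library. The endpoint and the form fields come from the command above. The function names, the optional `auth_key` parameter and the `Auth-Key` header name are assumptions for illustration, as is the expectation that the response body is JSON, so check the current MalwareBazaar API documentation before relying on them.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the wget command above.
API_URL = "https://mb-api.abuse.ch/api/v1/"


def build_request(sha256_hash, auth_key=None):
    """Build the same form-encoded POST request that the wget command sends."""
    body = urllib.parse.urlencode(
        {"query": "get_info", "hash": sha256_hash}
    ).encode("ascii")
    request = urllib.request.Request(API_URL, data=body, method="POST")
    if auth_key:
        # Assumption: the header name is not from the thread; verify it
        # against the API documentation.
        request.add_header("Auth-Key", auth_key)
    return request


def lookup_hash(sha256_hash, auth_key=None, timeout=30):
    """Send the request and return the response decoded as JSON (assumed format)."""
    request = build_request(sha256_hash, auth_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `lookup_hash("7de2c1bf58bce09eecc70476747d88a26163c3d6bb1d85235c24a558d1f16754")` needs network access. Building the request separately from sending it makes it possible to inspect or test the request offline.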

In addition, the Sophos SOREL collection contains 20 million samples you can use to train ML models. Be advised that you will also need a large number of safe files to control false positives.

Sophos-ReversingLabs (SOREL) 20 Million sample malware dataset | Sophos AI

The Sophos AI team is excited to announce the release of SOREL-20M (Sophos-ReversingLabs – 20 million) – a production-scale dataset […]

ai.sophos.com

I also found these, which list more APIs, supposedly open source.

threat-hunting-with-notebooks/Open Source Threat Intel lookup using Requests API.ipynb at master · ashwin-patil/threat-hunting-with-notebooks

Repository with Sample threat hunting notebooks on Security Event Log Data Sources - ashwin-patil/threat-hunting-with-notebooks

github.com

GitHub - jaegeral/security-apis: A collective list of public APIs for use in security. Contributions welcome

A collective list of public APIs for use in security. Contributions welcome - jaegeral/security-apis

github.com

GitHub - keithjjones/cuckoo-api: API for Cuckoo Malware Analysis Sandbox http://www.cuckoosandbox.org

API for Cuckoo Malware Analysis Sandbox http://www.cuckoosandbox.org - keithjjones/cuckoo-api

github.com

B-boy/StyLe/ · Apr 4, 2024

Too bad that Malc0de, MDL and clean-mx are no more. But the S!Ri site is still there:

VX Vault

Edit: I forgot about MalShare:

MalShare

The MalShare Project is a community driven public malware repository that works to provide free access to malware samples and tooling to the information security community.

malshare.com

Kongo · Apr 4, 2024


I also used VxVault years ago. But now you need to apply to get access, and I don't even know where.

n1d0 · Apr 4, 2024


Hello, Trident.

The MalwareBazaar website contains the information I am looking for. Thank you very much also for the documentation you shared; it may be useful to me.

B-boy/StyLe/ · Apr 4, 2024

Kongo said:
I also used VxVault years ago. But now you need to apply to get access, and I don't even know where.

That's true. However, the hashes are still visible (and that's what the OP asked for if I am not mistaken).
Also, the VT links are visible as well, and most malware researchers can download malware samples directly from VT.
So VxVault can still be useful.
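
Once hashes have been collected from a listing like this, the comparison step the OP describes could look roughly like the sketch below. It assumes SHA-256 digests stored one per line in a plain text file. The file layout and the function names are hypothetical and are not taken from any of the sites mentioned in this thread.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large samples do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def load_hashes(path):
    """Load one hex digest per line, skipping blank lines and '#' comments."""
    hashes = set()
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip().lower()
            if line and not line.startswith("#"):
                hashes.add(line)
    return hashes


def is_known_malware(path, known_hashes):
    """Return True if the file's SHA-256 digest is in the known-bad set."""
    return sha256_of_file(path) in known_hashes
```

Keep in mind that an exact hash match only catches byte-identical copies of a known sample. A single changed byte produces a different digest, so this check can confirm a known file but cannot clear an unknown one.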
