MalShare Java API client 1.2-stable release

The main page for this API client can be found here. MalShare is a free initiative for researchers to share malware samples for research purposes, which can be accessed via the website and via the API. The code can be found on GitHub, along with the latest release. This update contains information on breaking changes, newly added functions, increased usability, and architectural changes.

Breaking changes

This update introduces an HTTP status code check for every request that is made. If the returned status code is below 100, or is 400 or higher, an IOException is thrown. This avoids the return of empty objects when a request fails, as these exceptions should be handled in the calling function(s).
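The status code rule described above can be sketched as follows. This is a minimal illustration and not the library's actual implementation: the class name StatusCheckSketch and both method names are hypothetical, and only the thresholds and the IOException are taken from the text.

```java
import java.io.IOException;

final class StatusCheckSketch {

    // Hypothetical helper: codes below 100, or 400 and higher, count as failures.
    static boolean isFailure(int statusCode) {
        return statusCode < 100 || statusCode >= 400;
    }

    // Hypothetical helper: throws instead of letting the caller continue
    // with an empty result.
    static void requireSuccess(int statusCode) throws IOException {
        if (isFailure(statusCode)) {
            throw new IOException("Request failed with HTTP status code " + statusCode);
        }
    }

    public static void main(String[] args) {
        try {
            requireSuccess(200); // passes silently
            requireSuccess(404); // throws
        } catch (IOException e) {
            // The caller decides how to handle the failed request.
            System.out.println(e.getMessage());
        }
    }
}
```

The point of throwing, rather than returning an empty object, is that the caller is forced to decide what a failed request means in its own context.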

Several functions now throw this exception when a user-controlled value is not known by MalShare. To avoid unwanted surprises, note that the following existing functions with user-controlled arguments are affected:

public byte[] getFile(String hash) throws IOException
public Map<String, Byte[]> getFiles(List<String> hashes, boolean suppressExceptions) throws IOException
public MalShareFileDetails getFileDetails(String hash) throws IOException
public String getDownloadTaskStatus(String guid) throws IOException

Note that this information is also included in the library’s JavaDoc.

Additionally, the signature of getFiles has changed: aside from the list of strings, it now also requires a boolean. This boolean, named suppressExceptions, is used to ignore thrown exceptions, as its name indicates.

Without this argument, a single exception during a bulk download aborts the whole call, causing the details of all samples downloaded up to that point to be lost. With the new argument, one can ignore exceptions, meaning all successfully downloaded samples are returned. Comparing the mapping’s keys with the provided list of hashes will indicate which samples are missing, if any. The old and new function signatures are given below.

//Old function
public Map<String, Byte[]> getFiles(List<String> hashes) throws IOException
//New function
public Map<String, Byte[]> getFiles(List<String> hashes, boolean suppressExceptions) throws IOException
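The check for missing samples mentioned above can be sketched as follows. The class and method names are hypothetical, and the sketch assumes that the keys of the returned mapping are the hashes exactly as they were provided in the list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class MissingSamplesSketch {

    // Hypothetical helper: returns every requested hash that has no entry in
    // the mapping returned by a bulk download with suppressed exceptions.
    static List<String> findMissing(List<String> requested, Map<String, Byte[]> downloaded) {
        List<String> missing = new ArrayList<>();
        for (String hash : requested) {
            if (!downloaded.containsKey(hash)) {
                missing.add(hash);
            }
        }
        return missing;
    }
}
```

A caller would pass the same list that was given to getFiles, together with the mapping it returned. An empty result means every sample was downloaded.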

Newly added functions

Two functions have been added, providing wrappers for addDownloadUrl and getDownloadTaskStatus when using these functions in bulk. Both functions are given below.

public Map<String, String> addDownloadUrls(List<String> urls, boolean recursive) throws IOException
public Map<String, String> getDownloadTaskStatuses(List<String> guids, boolean suppressExceptions) throws IOException

Both functions return a mapping in which each key is an entry from the provided list, and each value is the result that the corresponding single-item function returns for that entry.

The recursive boolean in addDownloadUrls affects all entries in the list. Creating a custom object for each URL would allow a more granular approach. However, that approach would create more overhead when creating the list, compared to using two lists of URLs where one is recursively scanned and the other is not.

The suppressExceptions boolean in getDownloadTaskStatuses works exactly like the one outlined above for the getFiles function.
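How a bulk wrapper with a suppressExceptions flag could behave is sketched below. This is not the library's code: the SingleCall interface is a hypothetical stand-in for a single-item function such as getDownloadTaskStatus, and callForAll is a hypothetical name. Rethrowing when the flag is false is an assumption based on the description above.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BulkWrapperSketch {

    // Hypothetical stand-in for a single-item API call that may fail.
    interface SingleCall {
        String apply(String input) throws IOException;
    }

    // Calls the single-item function for every entry and collects the results,
    // keyed by the entry that was provided.
    static Map<String, String> callForAll(List<String> inputs, SingleCall call,
                                          boolean suppressExceptions) throws IOException {
        Map<String, String> results = new LinkedHashMap<>();
        for (String input : inputs) {
            try {
                results.put(input, call.apply(input));
            } catch (IOException e) {
                if (!suppressExceptions) {
                    // Assumed behaviour: abort, so the partial results never reach the caller.
                    throw e;
                }
                // Suppressed: this entry is simply absent from the mapping.
            }
        }
        return results;
    }
}
```

With suppression enabled, a failed entry shows up only as a missing key, which is why comparing the keys with the provided list is needed afterwards.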

Increased usability

The previously deprecated functions, which returned the server’s raw JSON response, have been removed to improve the readability of the code. Additionally, several spelling mistakes have been fixed, function behaviour has been clarified, and the readability of multi-paragraph JavaDoc comments has been improved by using newlines, bold text, and code text.

Architectural changes

The network-related code has been moved into a separate class, located in MalShareConnector.java. This provides a clearer overview of the network-related functions, as the newly added HTTP status check is also located in that class.

The main class, named MalShareApi, now contains a private helper function named box, which is used to convert a byte[] into a Byte[]. This function is required to use byte arrays in mappings, as a boxed object is required, whereas the default return value from the HTTP library is an unboxed byte array.
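A boxing helper of this kind can be sketched as follows. The actual implementation in MalShareApi may differ, and the class name BoxSketch is hypothetical. The conversion is needed because Java generics do not accept primitive types, so a Map value cannot be a byte[] of unboxed elements when Byte[] is declared.

```java
final class BoxSketch {

    // Converts a primitive byte array into an array of boxed Byte objects.
    static Byte[] box(byte[] input) {
        Byte[] output = new Byte[input.length];
        for (int i = 0; i < input.length; i++) {
            output[i] = input[i]; // autoboxing turns each byte into a Byte
        }
        return output;
    }

    public static void main(String[] args) {
        Byte[] boxed = box(new byte[] {0x4d, 0x5a});
        System.out.println(boxed.length); // prints 2
    }
}
```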


To contact me, you can e-mail me at [info][at][maxkersten][dot][nl], send me a PM on Reddit, or DM me on Twitter @Libranalysis.