Topic: Hash file before upload

I would like to be able to get the sha1 hash of a file prior to uploading, to verify that it is not a duplicate. I've attempted to do the following:

var preloader = new mOxie.Image();
preloader.load( MsUpload.uploader.files[0].getSource() );
CryptoJS.SHA1( preloader.getAsBinaryString() ).toString()

However, the resulting sha1 does not match the one computed on the server. I've tried getAsBlob and getAsDataURL as well, and attempted to use getNative instead of getSource (though this resulted in an error). The runtime I'm using is HTML5.

Re: Hash file before upload

A little more information: I'm interested in modifying the MediaWiki extension MsUpload [1][2]. To expose the MsUpload object to the JavaScript console, I've added "window.MsUpload = MsUpload" at the end of MsUpload.js [3] in my local copy (the change is not in the version linked at [3]). In the console I then load the CryptoJS core and sha1 components from [4] and [5].

I'm then able to attempt to get a sha1 hash of a file by doing:

var file = MsUpload.uploader.files[0];
var reader = new FileReader();
reader.onload = (function(theFile) {
    return function(e) {
        var sha1 = CryptoJS.SHA1( e.target.result ).toString();
        console.log( sha1 );
    };
})(file);

I then trigger the onload event with one of these calls:

reader.readAsDataURL( file.getNative() ); // no error but incorrect sha1
reader.readAsText( file.getNative() ); // no error but incorrect sha1
reader.readAsBinaryString( file.getNative() ); // no error but incorrect sha1
reader.readAsArrayBuffer( file.getNative() ); // fails in Chrome

Thank you for any assistance.

[1] https://www.mediawiki.org/wiki/Extension:MsUpload
[2] https://github.com/wikimedia/mediawiki-extensions-MsUpload
[3] https://github.com/wikimedia/mediawiki-extensions-MsUpload/blob/master/MsUpload.js#L499
[4] https://cdnjs.cloudflare.com/ajax/libs/crypto-js/3.1.2/components/core-min.js
[5] https://cdnjs.cloudflare.com/ajax/libs/crypto-js/3.1.2/components/sha1.js

Re: Hash file before upload

Another update. Using the method of generating a sha1 sum shown in [1], I got the readAsArrayBuffer approach to run. It still does not produce the correct sha1, though.

var file = MsUpload.uploader.files[0].getNative();
var sha1 = CryptoJS.algo.SHA1.create();
var read = 0;
var unit = 1024 * 1024;
var blob;
var reader = new FileReader();
reader.onload = function(e) {
    var bytes = CryptoJS.lib.WordArray.create(
        e.target.result, e.target.result.byteLength );
    sha1.update(bytes);
    read += unit;

    if (read < file.size) {
        blob = file.slice(read, read + unit);
        reader.readAsArrayBuffer(blob);
    } else {
        var hash = sha1.finalize();
        // print the result
        console.log(hash.toString(CryptoJS.enc.Hex));
    }
};
reader.readAsArrayBuffer(file.slice(read, read + unit));

Aside from the required change to how the file object is accessed, I also modified the code from [1] to pass a second argument to CryptoJS.lib.WordArray.create(). This is the sigBytes argument, without which I got an "Uncaught RangeError: Invalid array length" error.
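
As a sanity check on the slice-and-update loop itself, independent of CryptoJS, here is a minimal sketch that hashes a buffer in 1 MiB chunks and compares the result with a one-shot hash. It assumes Node's built-in crypto module as a stand-in for CryptoJS.algo.SHA1, and sha1Chunked is a made-up helper name, so treat it as an illustration of the chunking logic only.

```javascript
// Sketch only: Node's crypto module stands in for CryptoJS (an assumption).
var crypto = require('crypto');

// Hypothetical helper: feed the buffer to the hasher in unit-sized slices.
function sha1Chunked(buffer, unit) {
    var hasher = crypto.createHash('sha1');
    for (var read = 0; read < buffer.length; read += unit) {
        // subarray clamps the end index, much like Blob.slice does
        hasher.update(buffer.subarray(read, read + unit));
    }
    return hasher.digest('hex');
}

// 2.5 MiB of sample data, so the last chunk is a partial one
var data = Buffer.alloc(2.5 * 1024 * 1024, 'x');
var whole = crypto.createHash('sha1').update(data).digest('hex');
console.log(sha1Chunked(data, 1024 * 1024) === whole);
```

If the chunked and one-shot digests agree here, the chunk bookkeeping (read, unit, slice) is unlikely to be the cause of a mismatch.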

[1] https://gist.github.com/npcode/11282867

Re: Hash file before upload

Plupload doesn't affect files in any way, unless you have client-side resize enabled.

Have you tried to generate hash without Plupload, with CryptoJS only? You could also compare it to the one you generate manually on server and see if they match.
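
One way to get a reference value to compare against is sketched below. It assumes Node's built-in crypto module is available (any trusted sha1 tool would do just as well), and sha1Hex is a hypothetical helper name used only for this example.

```javascript
// Sketch: compute a reference sha1 outside the browser with Node's crypto module.
var crypto = require('crypto');

// Hypothetical helper: return the hex sha1 of a Buffer or string.
function sha1Hex(data) {
    return crypto.createHash('sha1').update(data).digest('hex');
}

// Known test vector: sha1("abc")
console.log(sha1Hex(Buffer.from('abc'))); // a9993e364706816aba3e25717850c26c9cd0d89d
```

Passing the contents of the uploaded file (for example, read with fs.readFileSync) instead of the test string gives the value the browser-side hash should match.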

If you want to see your issue fixed, do not report it here; report it on GitHub.

Re: Hash file before upload

You are correct. I did not have CryptoJS's "components/lib-typedarrays-min.js" module loaded. Without it, CryptoJS generated a well-formed but incorrect sha1. With it loaded, I get the correct value. For reference, I submitted an issue to them here: https://github.com/brix/crypto-js/issues/82. Sorry for wasting your time, and thanks for a great product.
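
For anyone who finds this later, a rough sketch of why that module matters: as far as I understand it, a CryptoJS WordArray holds 32-bit big-endian words plus a count of significant bytes, so the raw bytes of an ArrayBuffer have to be packed into words before hashing. The helper below (bytesToWords, a made-up name, not a CryptoJS function) illustrates that packing step under that assumption; it is not the library's actual source.

```javascript
// Hypothetical helper, not part of CryptoJS: pack bytes into 32-bit words,
// four bytes per word, most significant byte first.
function bytesToWords(bytes) {
    var words = [];
    for (var i = 0; i < bytes.length; i++) {
        // An unset slot is undefined, which the bitwise OR treats as 0.
        words[i >>> 2] |= bytes[i] << (24 - (i % 4) * 8);
    }
    return { words: words, sigBytes: bytes.length };
}

var packed = bytesToWords(new Uint8Array([0x61, 0x62, 0x63])); // the bytes of "abc"
console.log(packed.words[0].toString(16), packed.sigBytes); // 61626300 3
```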