  • IT/web question (hosting large volumes of files)
  • sharkbait
    Free Member

    We have quite a large number of images (approx 5 million / 2TB) that are accessed by a small number of clients via a web-based database search that we host ourselves.
    Until now the number of clients requiring access has been small, so we have hosted everything ourselves, allowing access to the database/scanned images through the firewall using IP filtering.
    Now the number of public IP addresses that we would have to allow access to is becoming more than I'm happy to deal with via filtering, so I have made access to the search engine/database more public, albeit protected by username/password access.
    The issue now is that once the clients have found the image(s) they want to download they still need access to them, which brings me back to filtering too many IP addresses.
    So I 'could' move all the images to cloud storage (utilizing username/password access again), but this would probably cost more than we could afford, or put a webserver (with access to the drives storing the images) outside our firewall and use username/password protection again.
    Can't decide. If there are alternatives I'd be interested.

    xiphon
    Free Member

    Do the clients access the files directly via HTTP/FTP? (e.g. http://website.com/image/1234.jpg)

    Could you proxy through the public facing web server?

    This would provide authentication and authorisation to access the file.

    (http://website.com/image=1234.jpg)

    sharkbait
    Free Member

    Yes, they do access the files directly. Not sure about proxying… will look into that one.

    atlaz
    Free Member

    Why can't you just have a basic web front-end and control access through that? Your network should have a DMZ where you can place one server outside the trusted area, so you won't need IP-based restrictions; a web (or FTP) access method will do the rest of the control.

    It's what I'd do at least… or get a webhost you can shove the data onto. A server to hold that shouldn't cost more than one to two hundred a month.

    xiphon
    Free Member

    For the basic web front end, all you need is a small script which:

    1. Authenticates the username/password combo.
    2. Authorises that user to access the requested file.
    3. Has sufficient security checks, e.g. no direct URL requests (all requests must originate from a follow-through 'click' on your website – *not* by typing in http://website.com/page?image=1234).
    4. Logs all activity!
    5. Only lets traffic through the FW to the file server that originates from the web server itself. No direct public access.
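
    Something like this would cover 1, 2 and 4 (a rough PHP sketch, assuming a session-based login; userMayAccess() and the log path are placeholders you'd swap for your own lookup):

    <?php
    // Rough sketch only - assumes the user logged in earlier and a username
    // was stored in the session. userMayAccess() is a placeholder.
    session_start();

    $image = isset($_GET['image']) ? $_GET['image'] : '';

    // Placeholder: replace with a real lookup against your permissions table.
    function userMayAccess($user, $image) {
        return true;
    }

    // 1. Authenticate: has this session already passed the username/password check?
    if (empty($_SESSION['username'])) {
        http_response_code(403);
        exit('Not logged in');
    }

    // 2. Authorise: is this user allowed this particular image?
    if (!userMayAccess($_SESSION['username'], $image)) {
        http_response_code(403);
        exit('Not authorised for this file');
    }

    // 4. Log the request.
    error_log(date('c') . " {$_SESSION['username']} requested {$image}\n", 3, '/var/log/imageproxy.log');

    // 3 is the referrer check (covered later in the thread); 5 is a firewall rule, not script code.
    ?>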

    sharkbait
    Free Member

    3. Has sufficient security checks, e.g. no direct URL requests (all requests must originate from a follow-through 'click' on your website – *not* by typing in http://website.com/page?image=1234).

    OK, so at the moment the user can only access the database [to search for images] after username/password verification (this is table-based and carried out by the database itself).
    After verification they can search for images and a link to the image is supplied by the database. How could I make all image requests come from the public server and not directly?

    xiphon
    Free Member

    Let's say we have a script called image.html (the language is entirely up to you, so I'll keep this generic).

    1. The script checks who the referrer was – was it a link on a web page, and if so, which page and which domain? If the referrer was *not* "www.website.com/searchresult.html", stop the script with a message saying "No direct linking" etc. A basic referrer sanity check.

    2. Is the user authenticated? Are they authorised to download the requested image file?

    3. To proxy through the web server, you will need to provide read-only access via a network share (somehow, up to you 🙂). Make sure the service account the web server runs under has read access to the images directory on the file server. The web server then requests the file for you and serves it up.
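
    For point 1, the referrer check might look something like this (a sketch only, assuming the search results page lives at www.website.com/searchresult.html; the Referer header is sent by the browser and easily spoofed, so treat it as a sanity check rather than real security):

    <?php
    // Basic referrer sanity check - reject requests that didn't come from the search page.
    $referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';

    if (parse_url($referer, PHP_URL_HOST) !== 'www.website.com'
        || parse_url($referer, PHP_URL_PATH) !== '/searchresult.html') {
        http_response_code(403);
        exit('No direct linking - please use the search page.');
    }
    ?>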

    sharkbait
    Free Member

    Cheers xiphon – I’ll look more into that.

    xiphon
    Free Member

    No probs. Be interested to hear how you get on, and what solution you end up using.

    sharkbait
    Free Member

    Well, given that a typical path to an image download is http://www.mydomainname/imagedump/clientname/directory/imagefilename.tif
    – two things I'm thinking of doing on the public server are:

    1) Creating a directory called 'imagedump' and making it username/password protected – this is fairly obvious: any request for an image will generate a username/password prompt.

    2) Then creating a URL redirection for /imagedump/ to http://my_server_ip_number/imagedump – which pushes all image requests to my server. I was hoping this would make the requests come from the public server's IP address… but it doesn't. 😐

    xiphon
    Free Member

    You have to get the web server to ‘serve’ the file for you.

    Look into HTTP headers. An example of which can be seen here:

    http://www.php.net/manual/en/function.header.php#74884

    <?php
    $filename = "theDownloadedFileIsCalledThis.mp3";  // name the browser will save the file as
    $myFile = "/absolute/path/to/my/file.mp3";        // actual location on disk (not web accessible)

    $mm_type = "application/octet-stream";

    header("Cache-Control: public, must-revalidate");
    header("Pragma: hack"); // WTF? oh well, it works…
    header("Content-Type: " . $mm_type);
    header("Content-Length: " . (string)(filesize($myFile)));
    header('Content-Disposition: attachment; filename="' . $filename . '"');
    header("Content-Transfer-Encoding: binary");

    readfile($myFile); // stream the file contents to the browser

    ?>

    It's using PHP to allow the browser to download an MP3 file… which is in a folder that is not web accessible.

    The script finds file.mp3 somewhere on the file system and serves it up to the browser under the name theDownloadedFileIsCalledThis.mp3.

    In your example, it might be re-written as:

    $filename = "{$clientname}_{$imageNumber}.tif";
    $myFile = "X:/imagedump/{$clientname}/{$directory}/{$imageNumber}.tif";

    $mm_type = "image/tiff";

    With $imageNumber, $clientname and $directory all being variables.
    X: could be a mapped network drive to \\server2\images.
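
    Since those variables end up in a filesystem path, it would be worth checking them before building $myFile. A rough sketch (the parameter names are just illustrative):

    <?php
    // Illustrative only - parameter names and the X: mapping follow the example above.
    $clientname  = isset($_GET['client']) ? $_GET['client'] : '';
    $directory   = isset($_GET['dir'])    ? $_GET['dir']    : '';
    $imageNumber = isset($_GET['image'])  ? $_GET['image']  : '';

    // Whitelist the allowed characters so a request can't walk the filesystem
    // with "../" or an absolute path.
    foreach (array($clientname, $directory, $imageNumber) as $part) {
        if (!preg_match('/^[A-Za-z0-9_-]+$/', $part)) {
            http_response_code(400);
            exit('Bad request');
        }
    }

    $filename = "{$clientname}_{$imageNumber}.tif";                           // name the browser sees
    $myFile = "X:/imagedump/{$clientname}/{$directory}/{$imageNumber}.tif";   // X: mapped to \\server2\images

    if (!is_file($myFile)) {
        http_response_code(404);
        exit('Image not found');
    }
    // ...then fall through to the header()/readfile() code above.
    ?>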

    Unfortunately, it sounds a little out of your skill range at the moment… no offence! Do you have any developers who might be able to put together a script for you?

    brassneck
    Full Member

    You should also consider getting a proper certificate and running the service over HTTPS, else all those usernames and passwords are in clear text on da internetz.

    I think you should drop the filtering and just make the server robust and put it in a DMZ. If you have to serve this yourself for cost reasons, I'd go for a little Linux box running vsftpd and SSH ONLY, though this would be a bit of a sod to administer lots of users on (a little bit of scripting would make it easy enough). Other than that I'd look at an external directory to hold the FTP accounts (if there are lots again)… but you're getting beyond the realms of a bike forum post here 🙂

    xiphon
    Free Member

    ProFTPD is another good FTP server, which supports FTPS (FTP over SSL).

