BigFileTools

This project is a collection of hacks needed to manipulate files larger than 2 GB in PHP. Currently it supports getting the exact file size. The project originated as an answer to a Stack Overflow question.

Example usage:

require '../vendor/autoload.php';
use BigFileTools\BigFileTools;

// alternatively you can use BigFileTools::createFrom([drivers]) to provide custom drivers
$file = BigFileTools::createDefault()->getFile(__FILE__);
echo "Example file size is " . $file->getSize() . " bytes\n";

This produces the following output:

Example file size is 176 bytes

Note that getSize() returns a Brick\Math\BigInteger, because PHP's native integer can be too small to hold the size.

To get an approximate file size you can convert the BigInteger to a float. Note that doing so loses precision.
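As a minimal sketch of that trade-off, the function below turns an exact byte count into an approximate human-readable size. It is not part of the library: approximateSize() is a hypothetical helper, and it assumes you pass it the size as a decimal string, for example by casting the result of getSize() to string.

```php
<?php
declare(strict_types=1);

/**
 * Format an exact byte count (a decimal string) as an approximate,
 * human-readable size. Hypothetical helper, not a BigFileTools API.
 */
function approximateSize(string $bytes): string
{
    $units = ['B', 'KiB', 'MiB', 'GiB', 'TiB', 'PiB', 'EiB'];
    $value = (float) $bytes; // this cast is where precision is lost
    $unit = 0;
    while ($value >= 1024 && $unit < count($units) - 1) {
        $value /= 1024;
        $unit++;
    }
    // %F is the locale-independent float format.
    return sprintf('%.2F %s', $value, $units[$unit]);
}

echo approximateSize('5368709120'), "\n"; // prints "5.00 GiB"

// Above 2^53 neighbouring integers collapse into the same float:
var_dump((float) '9007199254740993' === (float) '9007199254740992'); // bool(true)
```

Use the float only for display. Keep the exact value for anything that compares or adds sizes.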

Tip: If you are using a DI container, you are probably interested in non-static configuration. An example for this scenario is provided in the examples directory.

Will this really work?

To test things out, this project uses Travis to evaluate UNIX systems and AppVeyor to check compatibility with Windows. Every driver in this project is tested to return exact values for files over 2^32 bytes in size. See the tests directory for more about testing.

Under the hood

To understand what is happening, we need a short introduction to how numbers are represented in computers.

The problem is that PHP uses a 32-bit signed integer on many platforms. 64-bit PHP on 64-bit Linux uses 64-bit integers, so you do not need this library there. On the other hand, 64-bit builds of PHP for 64-bit Windows used 32-bit integers before PHP 7. Because PHP integers are signed, one bit holds the sign (+/-) and the rest hold the value.

32-bit signed integer max value: 2^31 - 1 =             2 147 483 647 ~             2 gigabytes
64-bit signed integer max value: 2^63 - 1 = 9 223 372 036 854 775 807 ~ 9 223 372 000 gigabytes
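To see which case applies to your own PHP build, you can inspect the built-in constants PHP_INT_SIZE and PHP_INT_MAX. An n-bit signed integer tops out at 2^(n-1) - 1, and the short check below also shows what happens one step beyond that limit. The output depends on the platform, so none is listed here.

```php
<?php
declare(strict_types=1);

// PHP_INT_SIZE is the integer width in bytes: 4 on 32-bit builds, 8 on 64-bit.
$bits = PHP_INT_SIZE * 8;
printf("Integer width: %d bits, PHP_INT_MAX = %d\n", $bits, PHP_INT_MAX);

// Going past PHP_INT_MAX does not raise an error: the result silently
// becomes a float, which can no longer represent every byte count exactly.
$overflowed = PHP_INT_MAX + 1;
var_dump(is_float($overflowed)); // bool(true)
```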

To overcome this problem, this library uses a string representation of numbers, which means that only your RAM limits the size of a number.
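To illustrate the idea of string-based numbers, here is a minimal sketch of schoolbook addition on decimal strings. It only demonstrates the principle: addDecimalStrings() is a hypothetical helper, and the library itself relies on BigInteger rather than on code like this.

```php
<?php
declare(strict_types=1);

/**
 * Add two non-negative integers given as decimal strings, digit by digit,
 * so the result is never limited by PHP_INT_MAX.
 */
function addDecimalStrings(string $a, string $b): string
{
    $i = strlen($a) - 1;
    $j = strlen($b) - 1;
    $carry = 0;
    $result = '';
    // Walk both numbers from the least significant digit, carrying as needed.
    while ($i >= 0 || $j >= 0 || $carry > 0) {
        $sum = $carry;
        if ($i >= 0) {
            $sum += (int) $a[$i--];
        }
        if ($j >= 0) {
            $sum += (int) $b[$j--];
        }
        $result = ($sum % 10) . $result;
        $carry = intdiv($sum, 10);
    }
    return $result === '' ? '0' : $result;
}

// One past the 64-bit signed maximum, still exact:
echo addDecimalStrings('9223372036854775807', '1'), "\n"; // prints 9223372036854775808
```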

Caution: There are tons of non-solutions to this problem. Most of them look like sprintf("%u", filesize($file));. This does NOT solve the problem; it just shifts it a little. The %u specifier treats the given value as an unsigned integer, so the sign bit is also counted as part of the value. Unfortunately, this only moves the boundary from 2 GB to 4 GB.
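The shifted boundary can be simulated on a 64-bit PHP build. The sketch below rests on two assumptions: that the true size fits into a 64-bit integer, and that a 32-bit build keeps only the low 32 bits of the size (in practice the failure mode can be less predictable). simulateUnsignedFix() is a hypothetical name for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Simulate what sprintf("%u", filesize($file)) can recover when the size
 * was stored in a 32-bit integer. Run this on 64-bit PHP only.
 */
function simulateUnsignedFix(int $trueSize): int
{
    // Only the low 32 bits survive; %u merely reads them as unsigned.
    return $trueSize & 0xFFFFFFFF;
}

$gib = 1024 ** 3;
foreach ([1, 3, 5] as $n) {
    printf("%d GiB reported as %d bytes\n", $n, simulateUnsignedFix($n * $gib));
}
// 1 GiB and 3 GiB come back intact; 5 GiB comes back as 1073741824 (1 GiB).
```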

The second problem is that standard file manipulation APIs fail with strange errors or return weird values. That is why BigFileTools has drivers. By default they are tried from the fastest to the slowest, and unsupported ones are skipped.

Drivers

Currently there is support for size drivers, i.e. drivers for obtaining file size.

The default drivers and their order are selected based on two factors: availability and speed.

Driver            Time (s) ↓           Runtime requirements  Platform
CurlDriver        0.00045299530029297  CURL extension        -
NativeSeekDriver  0.00052094459533691  -                     -
ComDriver         0.0031449794769287   COM+.NET extension    Windows only
ExecDriver        0.042937040328979    exec() enabled        Windows, POSIX
NativeRead        2.7670161724091      -                     -

In the default configuration, size drivers are ordered by speed and unavailable ones are skipped, so you do not need to worry about compatibility.
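The selection logic can be pictured as a simple loop over an ordered list. The sketch below is an illustration under stated assumptions, not the library's code: the SizeDriver interface, both driver classes, and firstSupportedSize() are hypothetical names, and the real interfaces may differ.

```php
<?php
declare(strict_types=1);

/** Hypothetical driver contract; the library's real interfaces may differ. */
interface SizeDriver
{
    public function isSupported(): bool;

    /** Returns the exact size as a decimal string. */
    public function getSize(string $path): string;
}

/** Stand-in for a fast driver whose runtime requirement is missing. */
final class MissingExtensionDriver implements SizeDriver
{
    public function isSupported(): bool
    {
        return false;
    }

    public function getSize(string $path): string
    {
        throw new LogicException('An unsupported driver must not be called.');
    }
}

/** Stand-in for a slower driver that is always available. */
final class FallbackDriver implements SizeDriver
{
    public function isSupported(): bool
    {
        return true;
    }

    public function getSize(string $path): string
    {
        // Placeholder only: filesize() is the very call that breaks on big files.
        $size = filesize($path);
        if ($size === false) {
            throw new RuntimeException("Cannot determine size of $path");
        }
        return (string) $size;
    }
}

/** Try drivers in the given (fastest-first) order, skipping unsupported ones. */
function firstSupportedSize(array $drivers, string $path): string
{
    foreach ($drivers as $driver) {
        if ($driver->isSupported()) {
            return $driver->getSize($path);
        }
    }
    throw new RuntimeException('No supported driver found.');
}

$path = tempnam(sys_get_temp_dir(), 'bft');
file_put_contents($path, 'hello');
echo firstSupportedSize([new MissingExtensionDriver(), new FallbackDriver()], $path), "\n"; // prints 5
unlink($path);
```

Keeping the availability check separate from the size lookup is what lets a default list include platform-specific drivers without breaking on other platforms.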

Requirements

Please follow Composer requirements.

To speed things up (e.g. in production), I recommend installing the CURL extension.
