Apa maksud archive di komputer

In computing, an archive file is a computer file that is composed of one or more files along with metadata. Archive files are used to collect multiple data files together into a single file for easier portability and storage, or simply to compress files to use less storage space. Archive files often store directory structures, error detection and correction information, arbitrary comments, and sometimes use built-in encryption.[1][2][3]

Table of Contents Show

Software distribution
Compression methods
Pre-processing filters
Limitations
Video yang berhubungan

Archive files are particularly useful in that they store file system data and metadata within the contents of a particular file, and thus can be stored on systems or sent over channels that do not support the file system in question, only file contents – examples include sending a directory structure over email, files with names unsupported on the target file system due to length or characters, and retaining files' date and time information.[4]

Additionally, it facilitates transferring high numbers of small files such as resources of saved web pages, since a container file is transferred using a single file operation, whereas transferring many small files requires the computer to modify the file system structure for each file individually, making it considerably slower.[5][6]

Software distribution

Beyond archival purposes, archive files are frequently used for packaging software for distribution, as software contents are often naturally spread across several files; the archive is then known as a package. While the archival file format is the same, there are additional conventions about contents, such as requiring a manifest file, and the resulting format is known as a package format.[7] Examples include deb for Debian, JAR for Java, APK for Android, and self-extracting Windows Installer executables.

Features supported by various kinds of archives include:

converting metadata into data stored inside a file (e.g., file name, permissions, etc.)
checksums to detect errors
data compression
file concatenation to store multiple files in a single file
file patches / updates (when recording changes since a previous archive)
encryption
error correction code to fix errors
splitting a large file into many equal sized files for storage or transmission

Some archive programs have self-extraction, self-installation, source volume and medium information, and package notes/description.

The file extension or file header of the archive file are indicators of the file format used. Computer archive files are created by file archiver software, optical disc authoring software, and disk image software.[8]

An archive format is the file format of an archive file. Some formats are well-defined by their authors and have become conventions supported by multiple vendors and communities.[9]

Types

Archiving only formats store metadata and concatenate files.
Compression only formats only compress files.
Multi-function formats can store metadata, concatenate, compress, encrypt, create error detection and recovery information, and package the archive into self-extracting and self-expanding files.
Software packaging formats are used to create software packages that may be self-installing files.
Disk image formats are used to create disk images of mass storage volumes.

Examples

Filename extensions used to distinguish different types of archives include zip, rar, 7z, and tar, the first of which is the most widely implemented.[10]

Java also introduced a whole family of archive extensions such as jar and war (j is for Java and w is for web). They are used to exchange entire byte-code deployment. Sometimes they are also used to exchange source code and other text, HTML and XML files. By default they are all compressed.[11]

Archive files often include parity checks and other checksums for error detection, for instance zip files use a cyclic redundancy check (CRC). RAR archives may include additional error correction data (called recovery records).[12]

Archive files that do not natively support recovery records can use separate parchive (PAR) files that allows for additional error correction and recovery of missing files in a multi-file archive.[13]

File archiver
Disk image
Digital container format, a similar concept in media files

^ "Archive File: What it's Used For". Lifewire. Retrieved 2022-06-17.
^ "Archive files". www.ibm.com. 2015-02-07. Retrieved 2022-06-17.
^ "What is Archiving And Why is it Important?". Secure Data MGT. 2015-03-23. Retrieved 2022-06-17.
^ "Data Portability and Platform Competition | Is User Data Exported From Facebook Actually Useful to Competitors?" (PDF). Archive.org. p. 22. Retrieved June 17, 2022.
^ "Why file transfer speeds of small vs large files could be different". NetApp Knowledge Base. 2020-06-17. Retrieved 2022-06-17.
^ "Why Small Files Take Longer to Copy Than Large Files". Dataquest. 2018-10-10. Retrieved 2022-06-17.
^ Manager, Amit Ashbel, Senior Marketing and Strategy. "Data Archiving: The Basics and 5 Best Practices". cloud.netapp.com. Retrieved 2022-06-17.
^ "What Is a File Extension & Why Are They Important?". Lifewire. Retrieved 2022-06-17.
^ "What are Archive Files?". www.exefiles.com. Retrieved 2022-06-17.
^ "Common file name extensions in Windows". support.microsoft.com. Retrieved 2022-06-17.
^ Malefanem, Moses. "Learning Java Network Programming". {{cite journal}}: Cite journal requires |journal= (help)
^ Drummond, James R. (1997). Parity, Checksums and CRC Checks (PDF) (1st ed.). Toronto. p. 13.
^ text. "What are PAR and PAR2 Files?". Easynews. Retrieved 2022-06-17.

"Application Note on the .ZIP file format"- official white paper published by PKWARE, Inc.
Tape Archive (.TAR) file format specification- excerpt from File Format List 2.0 by Max Maischein
"IBM 726 Magnetic tape reader/recorder from IBM Archives
"1401 Data Processing System" from IBM Archives

File Archive formats at Curlie

Retrieved from "https://en.wikipedia.org/w/index.php?title=Archive_file&oldid=1099573797"

Page 2

7z is a compressed archive file format that supports several different data compression, encryption and pre-processing algorithms. The 7z format initially appeared as implemented by the 7-Zip archiver. The 7-Zip program is publicly available under the terms of the GNU Lesser General Public License. The LZMA SDK 4.62 was placed in the public domain in December 2008. The latest stable version of 7-Zip and LZMA SDK is version 22.01.[2]

7z file formatFilename extension

.7z

Internet media type

application/x-7z-compressed

Uniform Type Identifier (UTI)org.7-zip.7-zip-archiveMagic number'7', 'z', 0xBC, 0xAF, 0x27, 0x1CDeveloped byIgor Pavlov[1]Initial release1999; 23 years ago (1999)[2]Type of formatData compressionOpen format?Yes: GNU Lesser General Public License / Public domainWebsite7-zip.org

The official, informal 7z file format specification is distributed with 7-Zip's source code since 2015. The specification can be found in plain text format in the 'doc' sub-directory of the source code distribution.[3] There have been additional third-party attempts at writing more concrete documentation based on the released code.[4]

The 7z format provides the following main features:

Open, modular architecture that allows any compression, conversion, or encryption method to be stacked.
High compression ratios (depending on the compression method used).
AES-256 bit encryption.
Zip 2.0 (Legacy) Encryption
Large file support (up to approximately 16 exbibytes, or 264 bytes).
Unicode file names.
Support for solid compression, where multiple files of like type are compressed within a single stream, in order to exploit the combined redundancy inherent in similar files.
Compression and encryption of archive headers.
Support for multi-part archives : e.g. xxx.7z.001, xxx.7z.002, ... (see the context menu items Split File... to create them and Combine Files... to re-assemble an archive from a set of multi-part component files).
Support for custom codec plugin DLLs.

The format's open architecture allows additional future compression methods to be added to the standard.

Compression methods

The following compression methods are currently defined:

LZMA – A variation of the LZ77 algorithm, using a sliding dictionary up to 4 GB in length for duplicate string elimination. The LZ stage is followed by entropy coding using a Markov chain-based range coder and binary trees.
LZMA2 – modified version of LZMA providing better multithreading support and less expansion of incompressible data.[5]
Bzip2 – The standard Burrows–Wheeler transform algorithm. Bzip2 uses two reversible transformations; BWT, then Move to front with Huffman coding for symbol reduction (the actual compression element).
PPMd – Dmitry Shkarin's 2002 PPMdH (PPMII/cPPMII) with small changes: PPMII is an improved version of the 1984 PPM compression algorithm (prediction by partial matching).
DEFLATE – Standard algorithm based on 32 kB LZ77 and Huffman coding. Deflate is found in several file formats including ZIP, gzip, PNG and PDF. 7-Zip contains a from-scratch DEFLATE encoder that frequently beats the de facto standard zlib version in compression size, but at the expense of CPU usage.

A suite of recompression tools called AdvanceCOMP contains a copy of the DEFLATE encoder from the 7-Zip implementation; these utilities can often be used to further compress the size of existing gzip, ZIP, PNG, or MNG files.

Pre-processing filters

The LZMA SDK comes with the BCJ and BCJ2 preprocessors included, so that later stages are able to achieve greater compression: For x86, ARM, PowerPC (PPC), IA-64 Itanium, and ARM Thumb processors, jump targets are 'normalized' [5] before compression by changing relative position into absolute values. For x86, this means that near jumps, calls and conditional jumps (but not short jumps and conditional jumps) are converted from the machine language "jump 1655 bytes backwards" style notation to normalized "jump to address 5554" style notation; all jumps to 5554, perhaps a common subroutine, are thus encoded identically, making them more compressible.

BCJ – Converter for 32-bit x86 executables. Normalise target addresses of near jumps and calls from relative distances to absolute destinations.
BCJ2– Pre-processor for 32-bit x86 executables. BCJ2 is an improvement on BCJ, adding additional x86 jump/call instruction processing. Near jump, near call, conditional near jump targets are split out and compressed separately in another stream.
Delta encoding – delta filter, basic preprocessor for multimedia data.

Similar executable pre-processing technology is included in other software; the RAR compressor features displacement compression for 32-bit x86 executables and IA-64 executables, and the UPX runtime executable file compressor includes support for working with 16-bit values within DOS binary files.

Encryption

The 7z format supports encryption with the AES algorithm with a 256-bit key. The key is generated from a user-supplied passphrase using an algorithm based on the SHA-256 hash function. The SHA-256 is executed 218 (262144) times,[6] which causes a significant delay on slow PCs before compression or extraction starts. This technique is called key stretching and is used to make a brute-force search for the passphrase more difficult. Current GPU-based, and custom hardware attacks limit the effectiveness of this particular method of key stretching,[7] so it is still important to choose a strong password. The 7z format provides the option to encrypt the filenames of a 7z archive.

Limitations

The 7z format does not store filesystem permissions (such as UNIX owner/group permissions or NTFS ACLs), and hence can be inappropriate for backup/archival purposes. A workaround on UNIX-like systems for this is to convert data to a tar bitstream before compressing with 7z. But it is worth noting that GNU tar (common in many UNIX environments) can also compress with the LZMA2 algorithm ("xz") natively, without the use of 7z, using the "-J" switch. The resulting file extension is ".tar.xz" or ".txz" and not ".tar.7z". This method of compression has been adopted with many distributions for packaging, such as Arch, Debian (deb), Fedora (rpm) and Slackware. (The older "lzma" format is less efficient.)[8] On the other hand, it is important to note, that tar does not save the filesystem encoding, which means that tar compressed filenames can become unreadable if decompressed on a different computer.

The 7z format does not allow extraction of some "broken files"—that is (for example) if one has the first segment of a series of 7z files, 7z cannot give the start of the files within the archive—it must wait until all segments are downloaded. The 7z format also lacks recovery records, making it vulnerable to data degradation unless used in conjunction with external solutions, like parchives, or within filesystems with robust error-correction. By way of comparison, zip files also lack a recovery feature while the rar format has one.

Comparison of archive formats
List of archive formats
Open format

^ "A Few Questions for Igor Pavlov". Dr. Dobb's Data Compression Newsletter. 30 April 2003. Retrieved 26 December 2009.
^ a b History of 7-zip changes
^ LZMA SDK, "DOC" directory, 7zFormat.txt
^ ".7z format specification — py7zr – 7-zip archive library". py7zr.readthedocs.io.
^ a b Collin, Lasse. "lzma_.lzma". liblzma bindings. Archived from the original on 8 February 2010. Retrieved 3 January 2010. Compared to LZMA1, LZMA2 adds support for LZMA_SYNC_FLUSH, uncompressed chunks (smaller expansion when trying to compress uncompressible data), possibility to change lc/lp/pb in the middle of encoding, and some other internal improvements.
^ 7-zip source code
^ Colin Percival. scrypt. As presented in "Stronger Key Derivation via Sequential Memory-Hard Functions". presented at BSDCan'09, May 2009.
^ "GNU tar 1.34: 8.1 Using Less Space through Compression".