Documentation/vm/cleancache.txt +62 −43

-MOTIVATION
+.. _cleancache:
+
+==========
+Cleancache
+==========
+
+Motivation
+==========

 Cleancache is a new optional feature provided by the VFS layer that
 potentially dramatically increases page cache effectiveness for
@@ -21,9 +28,10 @@ Transcendent memory "drivers" for cleancache are currently implemented
 in Xen (using hypervisor memory) and zcache (using in-kernel compressed
 memory) and other implementations are in development.

-FAQs are included below.
+:ref:`FAQs <faq>` are included below.

-IMPLEMENTATION OVERVIEW
+Implementation Overview
+=======================

 A cleancache "backend" that provides transcendent memory registers itself
 to the kernel's cleancache "frontend" by calling cleancache_register_ops,
@@ -80,22 +88,33 @@ different Linux threads are simultaneously putting and invalidating a page
 with the same handle, the results are indeterminate. Callers must
 lock the page to ensure serial behavior.

-CLEANCACHE PERFORMANCE METRICS
+Cleancache Performance Metrics
+==============================

 If properly configured, monitoring of cleancache is done via debugfs in
-the /sys/kernel/debug/cleancache directory. The effectiveness of cleancache
+the `/sys/kernel/debug/cleancache` directory. The effectiveness of cleancache
 can be measured (across all filesystems) with:

-succ_gets       - number of gets that were successful
-failed_gets     - number of gets that failed
-puts            - number of puts attempted (all "succeed")
-invalidates     - number of invalidates attempted
+``succ_gets``
+        number of gets that were successful
+
+``failed_gets``
+        number of gets that failed
+
+``puts``
+        number of puts attempted (all "succeed")
+
+``invalidates``
+        number of invalidates attempted

 A backend implementation may provide additional metrics.

+.. _faq:
+
 FAQ
+===

-1) Where's the value? (Andrew Morton)
+* Where's the value? (Andrew Morton)

 Cleancache provides a significant performance benefit to many workloads
 in many environments with negligible overhead by improving the
@@ -137,7 +156,7 @@ device that stores pages of data in a compressed state. And
 the proposed "RAMster" driver shares RAM across multiple physical
 systems.

-2) Why does cleancache have its sticky fingers so deep inside the
+* Why does cleancache have its sticky fingers so deep inside the
    filesystems and VFS? (Andrew Morton and Christoph Hellwig)

 The core hooks for cleancache in VFS are in most cases a single line
@@ -168,9 +187,9 @@ filesystems in the future.
 The total impact of the hooks to existing fs and mm files is only
 about 40 lines added (not counting comments and blank lines).

-3) Why not make cleancache asynchronous and batched so it can
-   more easily interface with real devices with DMA instead
-   of copying each individual page? (Minchan Kim)
+* Why not make cleancache asynchronous and batched so it can
+  more easily interface with real devices with DMA instead
+  of copying each individual page? (Minchan Kim)

 The one-page-at-a-time copy semantics simplifies the implementation
 on both the frontend and backend and also allows the backend to
@@ -182,7 +201,7 @@ are avoided. While the interface seems odd for a "real device"
 or for real kernel-addressable RAM, it makes perfect sense for
 transcendent memory.

-4) Why is non-shared cleancache "exclusive"? And where is the
+* Why is non-shared cleancache "exclusive"? And where is the
    page "invalidated" after a "get"? (Minchan Kim)

 The main reason is to free up space in transcendent memory and
@@ -193,7 +212,7 @@ be easily extended to add a "get_no_invalidate" call.

 The invalidate is done by the cleancache backend implementation.

-5) What's the performance impact?
+* What's the performance impact?

 Performance analysis has been presented at OLS'09 and LCA'10.
 Briefly, performance gains can be significant on most workloads,
@@ -206,7 +225,7 @@ single-core systems with slow memory-copy speeds, cleancache
 has little value, but in newer multicore machines, especially
 consolidated/virtualized machines, it has great value.

-6) How do I add cleancache support for filesystem X? (Boaz Harrash)
+* How do I add cleancache support for filesystem X? (Boaz Harrash)

 Filesystems that are well-behaved and conform to certain
 restrictions can utilize cleancache simply by making a call to
@@ -236,7 +255,7 @@ Some points for a filesystem to consider:
 - A clustered FS should invoke the "shared_init_fs" cleancache hook
   to get best performance for some backends.

-7) Why not use the KVA of the inode as the key? (Christoph Hellwig)
+* Why not use the KVA of the inode as the key? (Christoph Hellwig)

 If cleancache would use the inode virtual address instead of
 inode/filehandle, the pool id could be eliminated. But, this
@@ -251,7 +270,7 @@ of cleancache would be lost because the cache of pages in cleanache
 is potentially much larger than the kernel pagecache and is most
 useful if the pages survive inode cache removal.

-8) Why is a global variable required?
+* Why is a global variable required?

 The cleancache_enabled flag is checked in all of the frequently-used
 cleancache hooks. The alternative is a function call to check a static
@@ -262,13 +281,13 @@ global variable allows cleancache to be enabled by default at compile
 time, but have insignificant performance impact when cleancache remains
 disabled at runtime.

-9) Does cleanache work with KVM?
+* Does cleanache work with KVM?

 The memory model of KVM is sufficiently different that a cleancache
 backend may have less value for KVM. This remains to be tested,
 especially in an overcommitted system.

-10) Does cleancache work in userspace? It sounds useful for
+* Does cleancache work in userspace? It sounds useful for
    memory hungry caches like web browsers. (Jamie Lokier)

 No plans yet, though we agree it sounds useful, at least for
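
Note on the "Cleancache Performance Metrics" hunk: the four debugfs counters it lists can be combined into a get hit ratio. The sketch below is illustrative only and is not part of the patch. It assumes each counter is exposed as a file of the same name containing a single decimal integer, and the helper names `read_counters` and `get_hit_ratio` are hypothetical.

```python
from pathlib import Path

# Counter names as listed in the documentation text of the diff.
COUNTERS = ("succ_gets", "failed_gets", "puts", "invalidates")


def read_counters(directory="/sys/kernel/debug/cleancache"):
    """Return {counter_name: value} for every counter file found.

    Assumption: each file holds one decimal integer followed by a newline.
    Counter files that do not exist are skipped.
    """
    base = Path(directory)
    values = {}
    for name in COUNTERS:
        path = base / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


def get_hit_ratio(values):
    """Fraction of gets that succeeded, or None if no gets were recorded."""
    hits = values.get("succ_gets", 0)
    misses = values.get("failed_gets", 0)
    total = hits + misses
    return hits / total if total else None
```

Calling `get_hit_ratio(read_counters())` on a machine with cleancache enabled would give the share of gets served from the backend. Reading debugfs normally requires root and a mounted debugfs.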