-
Notifications
You must be signed in to change notification settings - Fork 109
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement FileDescriptor.memoryMap
, FileDescriptor.memoryUnmap
, and FileDescriptor.memorySync
#68
base: main
Are you sure you want to change the base?
Conversation
…nd `FileDescriptor.memorySync`
…Mappings` parameter in `memorySync`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very much welcome. System definitely needs mmap
-- but we should expose its full power!
I did an initial review pass and noted many suggestions for improvements.
To correctly use mmap, we probably also need to expose a way to retrieve the page size (via getpagesize
, sysconf
, sysctl
, or one of a myriad other ways I'm sure). This could technically be deferred, but without a known page size, some of mmap's use cases would be a bit limited.
@_alwaysEmitIntoClient | ||
private init(_ raw: CInt) { self.init(rawValue: raw) } | ||
|
||
/// Pages may not be accessed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docs should specify the original C names of these constants (and types, where appropriate), to help orientate people who are already familiar with them, and to aid searching.
/// Pages may not be accessed. | |
/// Pages may not be accessed. | |
/// | |
/// The corresponding C constant is `PROT_NONE`. |
(This applies to all of these.)
@_alwaysEmitIntoClient | ||
public static var `private`: MemoryMapKind { MemoryMapKind(rawValue: _MAP_PRIVATE) } | ||
|
||
// TODO: There are several other MemoryMapKinds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should default to exposing every flag that is present in the underlying system headers. (This will be different on every supported platform.)
E.g., on Darwin, we have:
#define MAP_FIXED 0x0010 /* [MF|SHM] interpret addr exactly */
#define MAP_RENAME 0x0020 /* Sun: rename private pages to file */
#define MAP_NORESERVE 0x0040 /* Sun: don't reserve needed swap area */
#define MAP_RESERVED0080 0x0080 /* previously unimplemented MAP_INHERIT */
#define MAP_NOEXTEND 0x0100 /* for MAP_FILE, don't change file size */
#define MAP_HASSEMAPHORE 0x0200 /* region may contain semaphores */
#define MAP_NOCACHE 0x0400 /* don't cache pages for this mapping */
#define MAP_JIT 0x0800 /* Allocate a region that will be used for JIT purposes */
#define MAP_FILE 0x0000 /* map from file (default) */
#define MAP_ANON 0x1000 /* allocated from memory, swap space */
#define MAP_ANONYMOUS MAP_ANON
#define MAP_RESILIENT_CODESIGN 0x2000 /* no code-signing failures */
#define MAP_RESILIENT_MEDIA 0x4000 /* no backing-store failures */
#define MAP_32BIT 0x8000 /* Return virtual addresses <4G only */
#define MAP_TRANSLATED_ALLOW_EXECUTE 0x20000 /* allow execute in translated processes */
#define MAP_UNIX03 0x40000 /* UNIX03 compliance */
/// visible to other processes mapping the same region, and whether | ||
/// updates are carried through to the underlying file. This | ||
/// behavior is determined by exactly one flag. | ||
public struct MemoryMapKind: RawRepresentable, Hashable, Codable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
POSIX calls these "flags"; we should follow the same terminology.
public struct MemoryMapKind: RawRepresentable, Hashable, Codable { | |
public struct MemoryMapFlags: RawRepresentable, Hashable, Codable { |
Like MemoryProtection
, this needs to also be an OptionSet
.
public static var write: MemoryProtection { MemoryProtection(rawValue: _PROT_WRITE) } | ||
/// Pages may be executed. | ||
@_alwaysEmitIntoClient | ||
public static var executed: MemoryProtection { MemoryProtection(rawValue: _PROT_EXEC) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public static var executed: MemoryProtection { MemoryProtection(rawValue: _PROT_EXEC) } | |
public static var execute: MemoryProtection { MemoryProtection(rawValue: _PROT_EXEC) } |
/// mapping (and must not conflict with the open mode of the file). | ||
/// Flags can be the bitwise OR of one or more of each case. | ||
@frozen | ||
public struct MemoryProtection: RawRepresentable, Hashable, Codable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be an OptionSet
!
/// Specifies that an update be scheduled, but the call | ||
/// returns immediately. | ||
@_alwaysEmitIntoClient | ||
public static var asynchronous: MemorySyncKind { MemorySyncKind(rawValue: _MS_ASYNC) } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we also want MS_INVALIDATE
here.
On Darwin, we probably also want to expose MS_KILLPAGES
and MS_DEACTIVATE
.
public func memoryMap( | ||
length: Int, pageOffset: Int, kind: MemoryMapKind, protection: [MemoryProtection] | ||
) throws -> UnsafeMutableRawPointer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs significant changes.
- I think
mapMemory
would be a better name for this function. MemoryProtection
needs to be anOptionSet
. The protection argument must not be an array.- The term
pageOffset
(and its documentation above) does not match whatmmap
'soffset
argument means -- it's an offset into the file object referenced by the file descriptor. - The type of the offset must be
Int64
, to matchFileDescriptor.seek
. - If we want to reorder arguments this way, then the offset ought to precede the length, not vice versa.
- We must expose
mmap
'saddr
parameter (as an optional UnsafeMutableRawPointer, with a nil default). - The original
mmap
'sfd
parameter is optional. (MAP_ANONYMOUS mappings are very important, and they do not take a file descriptor.) This means that we must either move this method out ofFileDescriptor
, or we must introduce a second, top-levelmapMemory
that's dedicated to the anonymous case.
Note that for anonymous maps, Darwin uses thefd
argument as an extra information channel (seeVM_FLAGS_PURGABLE
,VM_MAKE_TAG()
) -- this may be an argument in favor of spinning of a separate function for that use case. mmap
can technically be used to map memory at address zero, i.e., the null pointer, which isn't representable by anUnsafeMutableRawPointer
. This is an extremely niche edge case, but I think it would make sense to change the return type to an optional to reflect this.
Here is one way to embrace these suggestions:
public struct MemoryProtection { ... }
public struct MemoryMapFlags { ... }
extension FileDescriptor {
public func mapMemory(
at address: UnsafeMutableRawPointer? = nil,
from offset: Int64,
length: Int,
protection: MemoryProtection,
flags: MemoryMapFlags,
) throws -> UnsafeMutableRawPointer?
}
public struct MachVMFlags: RawRepresentable {
public var rawValue: CInt
public init?(rawValue: CInt)
public static var purgeable: Self { /* VM_FLAGS_PURGABLE */ }
public static func tag(_ value: UInt8) -> Self
}
public func mapAnonymousMemory(
at address: UnsafeMutableRawPointer? = nil,
length: Int,
protection: MemoryProtection,
flags: MemoryMapFlags,
machVMFlags: MachVMFlags? = nil
) throws -> UnsafeMutableRawPointer?
(If we go in this direction, then both public variants must forward to the same @usableFromInline
implementation.)
Given that either MAP_SHARED
or MAP_PRIVATE
must always be present in the flags, it may make sense to model these as something other than mere flags -- perhaps going as far as splitting mapMemory
into two variants, mapSharedMemory
and mapPrivateMemory
. Then again, this would mean that we'd also need to split the anonymous variant, and this treads into slightly dangerous territory -- we could end up painting ourselves into a combinatorial explosion in case a platform decides to introduce new flavors in addition to these two.
/// - memoryMap: The memory map to unmap | ||
/// - length: Amount in bytes to unmap. | ||
@_alwaysEmitIntoClient | ||
public func memoryUnmap(memoryMap: UnsafeMutableRawPointer, length: Int) throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this belongs under FileDescriptor
. If we don't introduce a namespace for these, then this needs to be a top-level function.
I think unmapMemory
would be a better name for this. The pointer argument needs to be optional and I think it would make sense to label it at
.
public func memoryUnmap(memoryMap: UnsafeMutableRawPointer, length: Int) throws { | |
public func unmapMemory( | |
at address: UnsafeMutableRawPointer?, | |
length: Int | |
) throws |
public func memorySync( | ||
memoryMap: UnsafeMutableRawPointer, | ||
length: Int, | ||
kind: MemorySyncKind, | ||
invalidateOtherMappings: Bool = false | ||
) throws { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this belongs under FileDescriptor
. If we don't introduce a namespace for these, then this needs to be a top-level function.
I don't think we should split the MS_INVALIDATE
flag out of the rest -- there are additional (system-specific) flags that we also need to model, and it makes sense to keep things simple/consistent.
public func memorySync( | |
memoryMap: UnsafeMutableRawPointer, | |
length: Int, | |
kind: MemorySyncKind, | |
invalidateOtherMappings: Bool = false | |
) throws { | |
public func syncMemory( | |
at address: UnsafeMutableRawPointer, | |
length: Int, | |
flags: MemorySyncFlags | |
) throws { |
We can also have this function take an UnsafeMutableRawBuffer
instead of the first two arguments. (I'm not sure if that would be more or less convenient though. If we take this route, we probably need to adjust the return value of the mmap wrappers accordingly, as well as the parameters of munmap etc.
/// Determines whether memory sync should be | ||
/// synchronous, asynchronous. | ||
@frozen | ||
public struct MemorySyncKind: RawRepresentable, Hashable, Codable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like MemoryMapKind
, I think this should be modeling the original (open-ended) flags. It ought to be called MemorySyncFlags
, and it needs to be an OptionSet
, not an enum-like struct.
There is this whole family of address space-related functionality that we can't currently group under a specific type. This includes the anonymous flavor of @milseman are you okay with just pulling these in as top-level functions? We could choose to instead define a namespacing enum, and add all of these as members: enum Memory {
static var pageSize: Int { get }
// properties for getting the cache line size, cache sizes etc.
func map(at:, length:, protection:, flags:, from:, offset:) throws -> UMRP
func unmap(at:, length:) throws
func protect(at:, length:, protection:) throws
func sync(at:, length: flags:) throws
// BSD/Darwinisms
func advise(at:, length:, advice:) throws
func inherit(at:, length:, inherit:) throws
func isInCore(at:, length:) throws -> ??? // mincore (BSD)
// Realtime stuff
func lock(at:, length:) throws
func unlock(at:, length:) throws
func lockAll(flags:) throws
func unlockAll() throws
} This could be an attractive option, if for no other reason but to move these really low-level functions out of highly visible namespaces, such as |
For sysconf, we have #33, which we can pull stuff from any time. Those sketches also have the general need for a namespace for global functions. Should we have many or just one? E.g. |
Oh, good point! I forgot we wanted to introduce I think if we add one of these, we may as well add as many as we need. (I remember being worried about adding file operations like copy/rename/remove into a |
Add Swift implementations for
mmap
,munmap
, andmsync
:This PR adds all three methods because the ability to sync and delete mappings are sensible requirements of being able to create them.
Add
FileDescriptor.MemoryProtection
to describe the desired memory protection of a mapping. ThememoryMap
method accepts these values as an array to be bitwise OR'd by the implementation.Add
shared
andprivate
forFileDescriptor.MemoryMapKind
. There are more C flags available, but this seemed like the simplest starting point.Add
synchronous
andasynchronous
FileDescriptor.MemorySyncKind
. ThememorySync
method accepts either one of these values and aninvalidateOtherMappings
boolean that is bitwise AND'd in the implementation to represent the appropriate C flag.Add internal constants for flags/values related to
mmap
,munmap
, andmsync
.Add additional methods for more generic mocking and error handling. This seemed like a good addition for system calls that do not return
CInt
.This PR does not include a Windows implementation.
Open question: Should memory mappings be a unique type?
Arguments in favor:
UnsafeMutableRawPointer
leaves you with an unusable instance with no way to ascertain that it has been unmapped.Arguments against:
MemoryMapRaw
andMemoryMap<T>
that have a very similar API toUnsafeMutableRawPointer
andUnsafeMutablePointer<T>
. The added methods would besync
,unmap
, andisUnmapped
. This is more precise than what currently exists in this PR at the cost of overhead.