b0baef5 to df29418
Switched to async remove.
71912a3 to bea834e
Also move the rename temp file cleanup logic into the put_obj method, in preparation for moving the temp file creation logic into the optimistic delta commit loop.
        r#"^*[/\\]_delta_log[/\\](\d{20})\.checkpoint\.\d{10}\.(\d{10})\.parquet$"#
    )
    .unwrap();
}
Turns out we can't use join_path to build the regex query in a cross-platform way, because the Windows path separator \ is the regex escape character.
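A minimal sketch of the problem, using only the standard library. On Windows the path separator is `\`, and regex syntax treats a backslash as an escape prefix for the character that follows, so a pattern built by joining components with the platform separator no longer contains a literal separator. `naive_pattern` and `PORTABLE_FRAGMENT` are hypothetical names for illustration, not code from this PR; the character class `[/\\]` is the one used in the checkpoint pattern above.

```rust
/// Hypothetical helper: joins pattern components with the given path
/// separator, without escaping it for regex use.
fn naive_pattern(sep: char) -> String {
    format!("{0}_delta_log{0}", sep)
}

/// Portable alternative: a character class that matches either separator.
/// Inside a raw string, `\\` is the regex escape for one literal backslash.
const PORTABLE_FRAGMENT: &str = r"[/\\]_delta_log[/\\]";

fn main() {
    // With the Unix separator the joined pattern is harmless.
    assert_eq!(naive_pattern('/'), "/_delta_log/");

    // With the Windows separator the pattern text becomes `\_delta_log\`.
    // A regex engine reads each backslash as the start of an escape
    // sequence, not as a separator to match.
    assert_eq!(naive_pattern('\\'), r"\_delta_log\");

    println!("naive (windows): {}", naive_pattern('\\'));
    println!("portable:        {}", PORTABLE_FRAGMENT);
}
```

The trailing backslash is the more damaging one: whatever follows it in the full pattern, such as the opening parenthesis of a capture group, would be escaped instead of interpreted.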
use super::*;
use std::ffi::CString;

#[cfg(target_os = "linux")]
Linux-specific code in the unix-specific implementation? I'm quite confused about why these libc functions need to be mapped in. There's a high likelihood this could break on BSDs and non-glibc implementations.
(I haven't tested it of course, so you may know something I don't)
Isn't there a crate that abstracts this behavior which doesn't require the unsafe below? E.g. https://docs.rs/crate/atomicwrites/0.3.0
Of course, if you've already gone down this path and this is the best possible option, then so be it. I would appreciate some commentary in the code explaining why "this must be".
Platform-specific atomic rename needs to be implemented per platform, unfortunately, so BSD and non-glibc users will need to send PRs for the platforms they want to use, just like Windows users.
I looked into atomicwrites before, and their implementation is not ideal. It requires 3 system calls to achieve what can be done in 1 system call, which is the implementation we have here. I also don't think their implementation is correct, because it starts with a hard link call which, based on the std Rust docs, doesn't error out if the destination path already exists.
The unsafe code we have here is scoped to only the system call invocation, which is basically what happens when you call std::fs::rename. So even though atomicwrites has less unsafe code on the surface, it's just hidden under the std lib implementation. In short, I think our implementation doesn't have more unsafe code than atomicwrites. If anything, it might have less, because we invoke far fewer system calls.
why these libc functions need to be mapped in
Because they're missing from the libc crate, and the PR introducing them is still open.
Well, per rust-lang/libc#2116 (comment), it seems they merged this a couple of days ago.
unsafe fn platform_specific_rename(from: *const libc::c_char, to: *const libc::c_char) -> i32 {
    cfg_if::cfg_if! {
        if #[cfg(target_os = "linux")] {
            renameat2(libc::AT_FDCWD, from, libc::AT_FDCWD, to, RENAME_NOREPLACE)
Reading the manpage for this function, this little bit jumped out at me:
RENAME_NOREPLACE requires support from the underlying filesystem.
Emphasis mine, of course. I don't have a good alternative suggestion, but I would like to express my worry that there are couplings to underlying system implementation details that we cannot clearly communicate to users here.
Good call, added rustdoc for the filesystem backend to clarify that.
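One way the filesystem dependency could be surfaced to callers is by classifying the result of the rename call. The sketch below is an assumption for illustration, not code from this PR: `RenameOutcome` and `classify_rename_result` are hypothetical names, and the errno values are the Linux ones. Per the renameat2 manpage, EEXIST is returned when RENAME_NOREPLACE is set and the destination exists, and EINVAL is among the errors returned when the filesystem does not support a requested flag (EINVAL has other causes too, so the variant name stays cautious).

```rust
/// Hypothetical classification of a `renameat2(..., RENAME_NOREPLACE)` result.
#[derive(Debug, PartialEq)]
enum RenameOutcome {
    Renamed,
    /// Destination already existed, so nothing was replaced.
    DestinationExists,
    /// The filesystem may not support RENAME_NOREPLACE (EINVAL has other
    /// causes as well, so this is only a hint).
    FlagPossiblyUnsupported,
    /// Any other errno, passed through unchanged.
    Other(i32),
}

// errno values on Linux.
const EEXIST: i32 = 17;
const EINVAL: i32 = 22;

/// `ret` is the return value of the system call (0 on success, -1 on error)
/// and `errno` is the error code observed right after a failed call.
fn classify_rename_result(ret: i32, errno: i32) -> RenameOutcome {
    if ret == 0 {
        return RenameOutcome::Renamed;
    }
    match errno {
        EEXIST => RenameOutcome::DestinationExists,
        EINVAL => RenameOutcome::FlagPossiblyUnsupported,
        other => RenameOutcome::Other(other),
    }
}

fn main() {
    println!("{:?}", classify_rename_result(0, 0));
    println!("{:?}", classify_rename_result(-1, EEXIST));
    println!("{:?}", classify_rename_result(-1, EINVAL));
}
```

Keeping the unsupported-flag case distinct from a plain "destination exists" error would let the backend report a clearer message to users on filesystems without RENAME_NOREPLACE support.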
Description
Fix the Windows build, which was blocking us from releasing a Windows
Python client. Added macOS and Windows builds to the build matrix.
Related Issue(s)
Also move the rename temp file cleanup logic into the put_obj method, in
preparation for moving the temp file creation logic into the optimistic
delta commit loop. See #135.
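The cleanup-inside-put_obj idea can be sketched roughly as follows. This is a simplified, synchronous illustration under stated assumptions, not the code in this PR: the signature of `put_obj` here is hypothetical, and `rename_noreplace_stand_in` is a non-atomic placeholder (check, then `std::fs::rename`) used only so the sketch runs with the standard library. The real backend relies on a platform-specific atomic rename instead.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Stand-in for the platform-specific rename. NOT atomic: another writer
/// could create `to` between the check and the rename.
fn rename_noreplace_stand_in(from: &Path, to: &Path) -> io::Result<()> {
    if to.exists() {
        return Err(io::Error::new(io::ErrorKind::AlreadyExists, "destination exists"));
    }
    fs::rename(from, to)
}

/// Hypothetical `put_obj`: write `data` to `tmp_path`, rename it to `path`,
/// and remove the temp file if the rename fails.
fn put_obj(path: &Path, tmp_path: &Path, data: &[u8]) -> io::Result<()> {
    let mut file = fs::File::create(tmp_path)?;
    file.write_all(data)?;
    file.sync_all()?;
    drop(file);

    if let Err(err) = rename_noreplace_stand_in(tmp_path, path) {
        // Best-effort cleanup; the rename error is what the caller sees.
        let _ = fs::remove_file(tmp_path);
        return Err(err);
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let dir = std::env::temp_dir().join(format!("put_obj_sketch_{}", std::process::id()));
    fs::create_dir_all(&dir)?;
    let dest = dir.join("00000000000000000000.json");

    // First writer wins.
    put_obj(&dest, &dir.join("tmp_a"), b"first")?;

    // Second writer fails, and its temp file is cleaned up by put_obj.
    let second = put_obj(&dest, &dir.join("tmp_b"), b"second");
    assert_eq!(second.unwrap_err().kind(), io::ErrorKind::AlreadyExists);
    assert!(!dir.join("tmp_b").exists());
    assert_eq!(fs::read(&dest)?, b"first".to_vec());

    fs::remove_dir_all(&dir)
}
```

With cleanup owned by put_obj, a caller such as a commit loop only has to react to the returned error, for example by retrying with the next version, and does not need to track leftover temp files itself.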