fuse-waked: for CAS, hardlinks backed by hardlinks, share virtual mode. #1836
Will Dietz (dtzSiFive) wants to merge 2 commits into
f4e96a5 to 0eb963e
    if (it->second.is_visible(keyt.second)) return -EEXIST;

    if (it->second.is_visible(keyf.second)) return -EACCES;
Drive-by: ban hardlinks originating from visible files, since there is presently no structure to enforce read-only access through such links.
This fixes an existing bug where you can modify visible files through a hardlink.
Wake snippet demonstrating:
    export def datFile Unit =
        write "datFile" "hello world"

    export def modLink Unit =
        require Pass dat =
            datFile Unit
        makeExecPlan ("sh", "-c", "ln datFile modLink; echo asdf >> modLink; cat datFile", Nil) (dat, Nil)
        | setPlanLabel "test (hardlink): modLink"
        | setPlanShare False
        | runJobWith defaultRunner
        | getJobOutput

    export def test _ =
        require Pass out = modLink Unit
        Pass Unit
Note that since we -- before this PR -- block writes through hardlinks in CAS mode, this issue today only exists in non-CAS wake.
cc #1593.
This change might prevent benign jobs from performing reasonable behavior and should be considered carefully and perhaps separately 👍.
Yeah, I think this change is right, but I don't know if there is any fallout from it, so agreed that maybe we test and evaluate this in a separate PR.
Actually, I think attempts to hardlink visible files today "succeed", but they fall through: the link is not made in the staging directory and instead writes through to the workspace? The new -EEXIST check prevents this, as does the tentative explicit is_visible guard discussed here.
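To make the two guards discussed in this thread concrete, here is a minimal sketch of the check order. The `Job` struct and `check_link` helper are hypothetical simplifications for illustration, not the actual fuse-waked structures; only the `is_visible` name and the two error codes come from the diff.

```cpp
#include <cerrno>
#include <set>
#include <string>

// Hypothetical, simplified job state; the real daemon tracks more than this.
struct Job {
  std::set<std::string> files_visible;  // inputs the job is allowed to read
  bool is_visible(const std::string& f) const { return files_visible.count(f) != 0; }
};

// Hypothetical helper: returns 0 if the link may proceed, otherwise a negative
// errno value in the style of FUSE handlers.
inline int check_link(const Job& job, const std::string& from, const std::string& to) {
  // The destination already exists as a visible input.
  if (job.is_visible(to)) return -EEXIST;
  // Linking from a visible input would create a writable alias of it.
  if (job.is_visible(from)) return -EACCES;
  return 0;
}
```

With `datFile` visible, `check_link(job, "datFile", "modLink")` yields `-EACCES`, which is the case the Wake snippet above exercises.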
        hardlinks.insert(std::string(to));
        return 0;
    }
    return -EEXIST;
In CAS mode: never fall back to the workspace!
This may be unreachable (it should be), but be explicit here to avoid surprising behavior while both modes are supported.
Would ENOENT or EPERM be a better return value when a file is not staged?
Dare we set st_nlink based on this? 😉
Sam May (ag-eitilt) left a comment
This all looks good to me!
    trap 'rm link_file_src.txt link_file_dst.txt' EXIT

    STDOUT=$(${1}/wakebox -p input.json)
    export WAKE_CAS=1
Probably don't want to hardcode this in the test? I know we were saying that we'd rather push toward getting everything unified rather than supporting multiple code paths, but make test ; WAKE_CAS=1 make test should still leave us testing both paths over the full suite.
Ah, I see what you're getting at with this guard -- this is a breaking change between CAS and non-CAS. Can't say I like the different behaviour, but it's unlikely to cause any issue in practice. Instead of setting WAKE_CAS, though, can we check it, print a warning to stderr about skipping the test, and then exit 0? (Also applies to #1838)
Mainline wake doesn't propagate WAKE_CAS=1 into the tests; whether this is a bug or not, I am not sure. If I were certain, I'd have addressed it.
But okay, that works for me. I removed it in the multi-wake variant.
Good call!
      struct StagedFileData {
        std::string staging_path;
    -   mode_t mode;
    +   std::shared_ptr<mode_t> mode;

    +   bool is_hardlink() const { return mode.use_count() > 1; }
      };
I don't like this C++ design choice... struct should be data, and if you want methods you reach for class/interface. But that's me getting grumpy at the language, not at this implementation.
I'll hoist it out to is_hardlink on StagedItem, that fits the current organization/style better anyway. Meant to revisit this, thanks!
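As a rough illustration of the shared-mode idea in this diff, the sketch below shows two staged entries sharing one `mode_t` through a `std::shared_ptr`, with the hardlink check written as a free function. `make_link` and the free-function form of `is_hardlink` are hypothetical; the struct is a simplified stand-in for the one in the PR.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <sys/types.h>  // mode_t

// Simplified stand-in for the PR's staged-file record.
struct StagedFileData {
  std::string staging_path;
  std::shared_ptr<mode_t> mode;  // one mode value shared by every link to the file
};

// Free-function form of the check: more than one owner of the mode means
// at least one other entry links to the same file.
inline bool is_hardlink(const StagedFileData& f) { return f.mode.use_count() > 1; }

// Hypothetical helper: a new link gets its own staging path but shares the
// source's mode, so a chmod through either name is seen through both.
inline StagedFileData make_link(const StagedFileData& from, std::string link_staging_path) {
  return StagedFileData{std::move(link_staging_path), from.mode};
}
```

One caveat with this design: `use_count()` counts every copy of the pointer, so a stray copy of a `StagedFileData` would also make the entry look like a hardlink. The sketch assumes entries are moved or held by reference rather than copied.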
Abrar Quazi (AbrarQuazi) left a comment
LGTM, thanks!
We should merge this to master and then maybe cherry-pick to the feature/multi-wake branch so we can get to a beta release faster.
    it->second.files_wrote.insert(keyt.second);
    // Both hardlink paths need direct_io to prevent kernel caching issues
    hardlinks.insert(std::string(from));
    hardlinks.insert(std::string(to));
Can the hardlinks set be completely removed? Or is it still needed because non-CAS mode depends on it? If that's the case, maybe add a comment saying this structure should be removed eventually.
It's for non-CAS. I'll add a comment.
Mixed signals on where we want to send these changes hahaha. Applied y'all's feedback to multi-wake version. I'll sync those changes back here later.
Modify the hardlink test to run with CAS enabled, which before this change would cause it to fail (cannot write through hardlinks).
Extend the test to check that mode changes made through one link are observed through the other.
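For reference, the property the extended test checks is ordinary POSIX hardlink behaviour: both names refer to one inode, so a chmod through either is visible through the other. The sketch below demonstrates this on a regular filesystem, not through fuse-waked. The function name and the use of `/tmp` are assumptions; the file names mirror those in the test script.

```cpp
#include <stdlib.h>    // mkdtemp
#include <fcntl.h>     // open
#include <sys/stat.h>  // chmod, stat
#include <unistd.h>    // link, unlink, rmdir, close
#include <string>

// Creates a file and a hardlink to it in a fresh temporary directory, changes
// the mode through the link, and returns true if the original path reports
// the new mode. Assumes /tmp is writable and supports hardlinks.
inline bool chmod_visible_through_link() {
  char tmpl[] = "/tmp/hardlink_mode_XXXXXX";
  if (mkdtemp(tmpl) == nullptr) return false;
  const std::string dir = tmpl;
  const std::string src = dir + "/link_file_src.txt";
  const std::string dst = dir + "/link_file_dst.txt";

  bool ok = false;
  int fd = open(src.c_str(), O_CREAT | O_WRONLY | O_EXCL, 0644);
  if (fd >= 0) {
    close(fd);
    struct stat st;
    if (link(src.c_str(), dst.c_str()) == 0 &&
        chmod(dst.c_str(), 0600) == 0 &&
        stat(src.c_str(), &st) == 0) {
      // The mode was changed via dst but is read back via src.
      ok = (st.st_mode & 0777) == 0600;
    }
  }
  // Best-effort cleanup; failures here are ignored.
  unlink(dst.c_str());
  unlink(src.c_str());
  rmdir(dir.c_str());
  return ok;
}
```

A virtual filesystem that tracks mode per path rather than per file would fail this kind of check, which appears to be why the PR shares the mode between linked entries.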