Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] Faster repr() logic for String, StringRef, UInt, DType and others #3736

Closed
wants to merge 20 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
20 commits
Select commit Hold shift + click to select a range
19b23a1
Implement StringRef.__repr__()
msaelices Nov 3, 2024
950d36a
[stdlib] Faster path.stat() implementation using Writer
msaelices Nov 3, 2024
c6c9277
Optimize String.__repr__() by using Writer
msaelices Nov 3, 2024
f1472b9
[stdlib] Use StringRepr for StringRef.__repr__()
msaelices Nov 3, 2024
3fbb842
Add changelog entry
msaelices Nov 3, 2024
3c2129f
[stdlib] Faster UInt.__repr__()
msaelices Nov 7, 2024
91f0b80
[stdlib] Faster Error.__repr__()
msaelices Nov 7, 2024
db15abe
[stdlib] Faster DType.__repr__()
msaelices Nov 7, 2024
348b434
[stdlib] Simpler stat_result.__repr__() logic
msaelices Nov 7, 2024
97a7a96
[stdlib] Hopefully fix the test_stat() test in MacOS
msaelices Nov 7, 2024
4b67bfc
Merge branch 'nightly' into better-representable
msaelices Nov 7, 2024
40e2787
[stdlib] Simpler String.__repr__() logic
msaelices Nov 7, 2024
696339c
[stdlib] Remove two failing checks in test_stat() which depends on th…
msaelices Nov 7, 2024
fb93c6b
Merge branch 'nightly' into better-representable
msaelices Nov 8, 2024
7e4da67
Merge branch 'nightly' into better-representable
msaelices Nov 12, 2024
3ab5ef0
[stdlib] Adapt the code to s/unsafe_from_utf8_ptr/ptr refactoring
msaelices Nov 12, 2024
32a507d
[stdlib] Replace StringRepr to _StringRepr
msaelices Nov 12, 2024
1ee14b1
Fix typo in _StringRepr docstring
msaelices Nov 12, 2024
cb66ec9
[stdlib] Do not use _StringRepr for StringRef.__repr__()
msaelices Nov 12, 2024
958106b
[stdlib] Revert the StringRepr logic as there is a much better approach
msaelices Nov 12, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions docs/changelog.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,9 @@ what we publish.

### ⭐️ New

- `StringRef` is now representable so `repr(StringRef("hello"))` will return
`StringRef('hello')`.

- Mojo can now interpret simple LLVM intrinsics in parameter expressions,
enabling things like `count_leading_zeros` to work at compile time:
[Issue #933](https://github.com/modularml/mojo/issues/933).
Expand Down
2 changes: 1 addition & 1 deletion stdlib/src/builtin/dtype.mojo
Original file line number Diff line number Diff line change
Expand Up @@ -217,7 +217,7 @@ struct DType(
Returns:
The representation of the dtype.
"""
return "DType." + str(self)
return String.write("DType.", self)

@always_inline("nodebug")
fn get_value(self) -> __mlir_type.`!kgen.dtype`:
Expand Down
2 changes: 1 addition & 1 deletion stdlib/src/builtin/error.mojo
Original file line number Diff line number Diff line change
Expand Up @@ -180,7 +180,7 @@ struct Error(
Returns:
A printable representation of the error message.
"""
return "Error(" + repr(self._message()) + ")"
return String.write("Error(", repr(self._message()), ")")

# ===-------------------------------------------------------------------===#
# Methods
Expand Down
2 changes: 1 addition & 1 deletion stdlib/src/builtin/uint.mojo
Original file line number Diff line number Diff line change
Expand Up @@ -149,7 +149,7 @@ struct UInt(IntLike, _HashableWithHasher):
Returns:
The string representation of this UInt.
"""
return "UInt(" + str(self) + ")"
return String.write("UInt(", str(self), ")")

fn __hash__(self) -> UInt:
"""Hash the UInt using builtin hash.
Expand Down
49 changes: 31 additions & 18 deletions stdlib/src/os/fstat.mojo
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,7 @@ fn _constrain_unix():


@value
struct stat_result(Stringable):
struct stat_result(Stringable, Writable):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion Split this change to stat_result to be Writable along with its tests to a separate PR, please.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The goal of this commit was to have a faster representation of path.stat() which uses stat_result underneath, which was already representable but slow, so to me was in the same scope of "Faster repr() logic" msaelices@950d36a

Do you still want me to split those changes into a new PR?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the scope of the commit, but in the future, I'd recommend teasing apart the two ideas (applying Writable to a struct and making repr() faster for a few types). They're orthogonal things in my mind and could have been two separate PRs. It's fine for this one though, but something to consider in the future 😄

"""Object whose fields correspond to the members of the stat structure."""

var st_mode: Int
Expand Down Expand Up @@ -150,30 +150,43 @@ struct stat_result(Stringable):
self.st_rdev = st_rdev
self.st_flags = st_flags

@no_inline
fn write_to[W: Writer](self, inout writer: W):
"""
Formats this path to the provided Writer.

Parameters:
W: A type conforming to the Writable trait.

Args:
writer: The object to write to.
"""
writer.write("os.stat_result(")
writer.write("st_mode=", self.st_mode)
writer.write(", st_ino=", self.st_ino)
writer.write(", st_dev=", self.st_dev)
writer.write(", st_nlink=", self.st_nlink)
writer.write(", st_uid=", self.st_uid)
writer.write(", st_gid=", self.st_gid)
writer.write(", st_size=", self.st_size)
writer.write(", st_atime=", str(self.st_atimespec))
writer.write(", st_mtime=", str(self.st_mtimespec))
writer.write(", st_ctime=", str(self.st_ctimespec))
writer.write(", st_birthtime=", str(self.st_birthtimespec))
writer.write(", st_blocks=", self.st_blocks)
writer.write(", st_blksize=", self.st_blksize)
writer.write(", st_rdev=", self.st_rdev)
writer.write(", st_flags=", self.st_flags)
writer.write(")")
Comment on lines +164 to +180
Copy link
Contributor

@martinvuyk martinvuyk Nov 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion:
The String.write_bytes() reserves the amount of bytes to join the next item every time, in the future we should be able to constrain traits and know if they implement for example as_bytes() and be able to reserve the total amount of bytes beforehand. So this suggestion doesn't make a difference currently but maybe in the future:

these can all go in the same function call separated by commas. They will be formatted similarly by the Mojo formatter so it won't make much difference in the aesthetics other than the name of the field won't be directly to the left, still it's just a matter of personal preference and not performance currently 🤷‍♂️ .

(This would all become redundant if somebody reserves enough capacity for the whole os.stat result before calling write_to())


@no_inline
fn __str__(self) -> String:
"""Constructs a string representation of stat_result.

Returns:
A string representation of stat_result.
"""
var res = String("os.stat_result(")
res += "st_mode=" + str(self.st_mode)
res += ", st_ino=" + str(self.st_ino)
res += ", st_dev=" + str(self.st_dev)
res += ", st_nlink=" + str(self.st_nlink)
res += ", st_uid=" + str(self.st_uid)
res += ", st_gid=" + str(self.st_gid)
res += ", st_size=" + str(self.st_size)
res += ", st_atime=" + str(self.st_atimespec)
res += ", st_mtime=" + str(self.st_mtimespec)
res += ", st_ctime=" + str(self.st_ctimespec)
res += ", st_birthtime=" + str(self.st_birthtimespec)
res += ", st_blocks=" + str(self.st_blocks)
res += ", st_blksize=" + str(self.st_blksize)
res += ", st_rdev=" + str(self.st_rdev)
res += ", st_flags=" + str(self.st_flags)
return res + ")"
return String.write(self)

fn __repr__(self) -> String:
"""Constructs a representation of stat_result.
Expand Down
22 changes: 16 additions & 6 deletions stdlib/src/utils/stringref.mojo
Original file line number Diff line number Diff line change
Expand Up @@ -41,17 +41,18 @@ fn _align_down(value: Int, alignment: Int) -> Int:
@value
@register_passable("trivial")
struct StringRef(
Sized,
IntableRaising,
AsBytes,
Boolable,
CollectionElement,
CollectionElementNew,
Comparable,
Hashable,
IntableRaising,
Representable,
Sized,
Stringable,
Writable,
Hashable,
_HashableWithHasher,
Boolable,
Comparable,
AsBytes,
):
"""
Represent a constant reference to a string, i.e. a sequence of characters
Expand Down Expand Up @@ -396,6 +397,15 @@ struct StringRef(
"""
return String.write(self)

@no_inline
fn __repr__(self) -> String:
"""Convert the string reference to a string.

Returns:
The String representation of the StringRef.
"""
return String.write("StringRef(", repr(str(self)), ")")

@no_inline
fn write_to[W: Writer](self, inout writer: W):
"""
Expand Down
29 changes: 29 additions & 0 deletions stdlib/test/pathlib/test_pathlib.mojo
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,34 @@ def test_home():
set_home(original_home)


def test_stat():
var path = Path(__source_location().file_name)
var stat = path.stat()
assert_equal(
str(stat),
"os.stat_result(st_mode={}, st_ino={}, st_dev={}, st_nlink={},"
" st_uid={}, st_gid={}, st_size={}, st_atime={}, st_mtime={},"
" st_ctime={}, st_birthtime={}, st_blocks={}, st_blksize={},"
" st_rdev={}, st_flags={})".format(
stat.st_mode,
stat.st_ino,
stat.st_dev,
stat.st_nlink,
stat.st_uid,
stat.st_gid,
stat.st_size,
str(stat.st_atimespec),
str(stat.st_mtimespec),
str(stat.st_ctimespec),
str(stat.st_birthtimespec),
stat.st_blocks,
stat.st_blksize,
stat.st_rdev,
stat.st_flags,
),
)


def main():
test_cwd()
test_path()
Expand All @@ -159,3 +187,4 @@ def main():
test_read_write()
test_expand_user()
test_home()
test_stat()
20 changes: 20 additions & 0 deletions stdlib/test/utils/test_stringref.mojo
Original file line number Diff line number Diff line change
Expand Up @@ -169,6 +169,25 @@ fn test_stringref_split() raises:
assert_equal(res4[1], "o")


def test_str_and_ref():
assert_equal(StringRef("abc").__str__(), "abc")
assert_equal(StringRef("abc").__repr__(), "StringRef('abc')")
assert_equal(StringRef("\0").__repr__(), r"StringRef('\x00')")
assert_equal(StringRef("\x09").__repr__(), r"StringRef('\t')")
assert_equal(StringRef("\n").__repr__(), r"StringRef('\n')")
assert_equal(StringRef("\x0d").__repr__(), r"StringRef('\r')")
assert_equal(StringRef("'").__repr__(), 'StringRef("\'")')

# Multi-byte characters.__repr__()
assert_equal(
StringRef("Örnsköldsvik").__repr__(), "StringRef('Örnsköldsvik')"
) # 2-byte
assert_equal(StringRef("你好!").__repr__(), "StringRef('你好!')") # 3-byte
assert_equal(
StringRef("hello 🔥!").__repr__(), "StringRef('hello 🔥!')"
) # 4-byte


def main():
test_strref_from_start()
test_stringref_split()
Expand All @@ -177,3 +196,4 @@ def main():
test_indexing()
test_find()
test_endswith()
test_str_and_ref()