python - Column assignment with .alias() or = - Stack Overflow

What is the preferred way to assign/add a new column to a polars dataframe in .select() or .with_columns()?
Are there any differences between the below column assignments using .alias() or the = sign?

import polars as pl

df = pl.DataFrame({"A": [1, 2, 3],
                   "B": [1, 1, 7]})

df = df.with_columns(pl.col("A").sum().alias("a_sum"), 
                     another_sum=pl.col("A").sum()
                     )

I am not sure which one to use.

Asked Nov 18, 2024 at 17:34 by mouwsy (edited Nov 18, 2024 at 17:38)

2 Answers

The advantage of alias is that it allows you to specify a column name that wouldn't be a valid Python identifier. For example, you could use "a sum!". This can also be achieved by creating a dictionary and using ** to unpack it, passing the items as keyword arguments.

Assignment with = cannot work in this way, as it requires a valid identifier (e.g., another_sum).

df = df.with_columns(pl.col("A").sum().alias("a sum!"), 
                     another_sum=pl.col("A").sum(),
                     **{":) \u2014 also a sum": pl.col("A").sum()}
                     )

Output:

shape: (3, 5)
┌─────┬─────┬────────┬─────────────┬─────────────────┐
│ A   ┆ B   ┆ a sum! ┆ another_sum ┆ :) — also a sum │
│ --- ┆ --- ┆ ---    ┆ ---         ┆ ---             │
│ i64 ┆ i64 ┆ i64    ┆ i64         ┆ i64             │
╞═════╪═════╪════════╪═════════════╪═════════════════╡
│ 1   ┆ 1   ┆ 6      ┆ 6           ┆ 6               │
│ 2   ┆ 1   ┆ 6      ┆ 6           ┆ 6               │
│ 3   ┆ 7   ┆ 6      ┆ 6           ┆ 6               │
└─────┴─────┴────────┴─────────────┴─────────────────┘
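
Why the dictionary form accepts names that = rejects is ordinary Python behaviour rather than anything specific to polars. The following is a minimal sketch in plain Python; collect_column_names is a hypothetical stand-in for any method that takes named expressions as keyword arguments, not part of the polars API.

```python
def collect_column_names(*exprs, **named_exprs):
    """Hypothetical stand-in for a method that accepts named expressions as keywords."""
    # Keyword arguments arrive as an ordinary dict keyed by strings.
    return list(named_exprs)


# A keyword written with = has to be a valid Python identifier.
names = collect_column_names(another_sum=1)

# Keys unpacked with ** only need to be strings, so spaces and punctuation are allowed.
names_unpacked = collect_column_names(**{"a sum!": 2, ":) \u2014 also a sum": 3})

print(names)                           # ['another_sum']
print(names_unpacked)                  # ['a sum!', ':) — also a sum']
print("a sum!".isidentifier())         # False
print("another_sum".isidentifier())    # True
```

Writing "a sum!"=... directly in the call would be a syntax error, which is why the name has to travel either through .alias() or through an unpacked dictionary.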

The keyword (=) form just calls alias for you under the hood:

https://github.com/pola-rs/polars/blob/a0ec630b25aa847699f9c2d7389fee84749a6491/py-polars/polars/_utils/parse/expr.py#L136-L140

So there is no advantage to either.

If you find = more readable, use that.
