Use of unique(col) in lab2

p.hanel · November 2025

In lab2 under 5.0.1 there is a method using col:

There is a line of code calling this method like so:

When the latter is run I get the following error:

When I compare against the finished lab to check for differences, it uses a different library: the old torchdata datapipes, which have a different syntax.

Understanding:
My understanding is that we pass in one string from cat_attr at a time and call the method once for each. Each gen_encoder_dict(series) call runs unique() and expects col to have a value, but the function has no access to that variable. I tried a couple of different things, but I am not 100% sure what the proper fix is, since the code appears this way in both the lab and the lesson notes.

Question
Does this code block compile and run on your side?

dvgodoy · November 2025

Hi @p.hanel ,

Apologies for the delayed response.
You're right, the function should read:

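def gen_encoder_dict(dataset, col):
    # col is now an explicit argument, so unique() knows which column to scan
    values = sorted(dataset.unique(col))
    # appended after sorting, so 'UNKNOWN' always gets the last index
    values += ['UNKNOWN']
    return dict(zip(values, range(len(values))))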

And when it's called, it should be:

dropdown_encoders = {col: gen_encoder_dict(datasets['train'], col) for col in cat_attr}
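
To see the corrected function and call working together without loading the lab data, here is a minimal sketch. TinyDataset and the three sample rows are hypothetical stand-ins, made up for illustration; the only thing assumed about the real dataset is that it exposes a unique(column) method returning the distinct values of that column.

```python
class TinyDataset:
    """Hypothetical stand-in for the lab's dataset; only unique() is mimicked."""

    def __init__(self, rows):
        self.rows = rows

    def unique(self, column):
        # Distinct values found in the given column
        return list({row[column] for row in self.rows})


def gen_encoder_dict(dataset, col):
    values = sorted(dataset.unique(col))
    values += ['UNKNOWN']
    return dict(zip(values, range(len(values))))


# Made-up rows, only to exercise the function
rows = [
    {'model': 'Fiesta', 'transmission': 'Manual', 'fuel_type': 'Petrol'},
    {'model': 'Focus', 'transmission': 'Automatic', 'fuel_type': 'Diesel'},
    {'model': 'Fiesta', 'transmission': 'Manual', 'fuel_type': 'Diesel'},
]
datasets = {'train': TinyDataset(rows)}
cat_attr = ['model', 'transmission', 'fuel_type']

dropdown_encoders = {col: gen_encoder_dict(datasets['train'], col) for col in cat_attr}
print(dropdown_encoders['model'])  # {'Fiesta': 0, 'Focus': 1, 'UNKNOWN': 2}
```

Because col is passed in explicitly, each call builds the mapping for exactly one column, and the comprehension produces one encoder per entry in cat_attr.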

Moreover, the preproc() function is duplicated. Please ignore the first one. The second - the correct one - is unfortunately missing proper indentation. It should read:

import numpy as np

def preproc(row):
    colnames = ['model', 'year', 'price', 'transmission', 'mileage', 'fuel_type', 'road_tax', 'mpg', 'engine_size']#, 'manufacturer']

    cat_attr = ['model', 'transmission', 'fuel_type']
    cont_attr = ['year', 'mileage', 'road_tax', 'mpg', 'engine_size']
    target = 'price'

    # continuous attributes are cast to float
    cont_X = [float(row[name]) for name in cont_attr]
    # categorical attributes are mapped to integer codes; unseen values fall back to 'UNKNOWN'
    cat_X = [dropdown_encoders[name].get(row[name], dropdown_encoders[name]['UNKNOWN']) for name in cat_attr]

    return {'label': np.array([float(row[target])], dtype=np.float32),
            'cont_X': np.array(cont_X, dtype=np.float32),
            'cat_X': np.array(cat_X, dtype=int)}
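
The part of preproc() that depends on the encoders is the categorical lookup, so here is a small sketch isolating just that step. The encoder values and the two rows are made up for illustration, and encode_categoricals is a hypothetical helper name, not something from the lab.

```python
# Hypothetical encoders, shaped like the output of gen_encoder_dict()
dropdown_encoders = {
    'model': {'Fiesta': 0, 'Focus': 1, 'UNKNOWN': 2},
    'transmission': {'Automatic': 0, 'Manual': 1, 'UNKNOWN': 2},
    'fuel_type': {'Diesel': 0, 'Petrol': 1, 'UNKNOWN': 2},
}
cat_attr = ['model', 'transmission', 'fuel_type']


def encode_categoricals(row):
    # Same expression as the cat_X line in preproc():
    # dict.get() returns the 'UNKNOWN' code when the value was never seen in training
    return [dropdown_encoders[name].get(row[name], dropdown_encoders[name]['UNKNOWN'])
            for name in cat_attr]


seen_row = {'model': 'Focus', 'transmission': 'Manual', 'fuel_type': 'Petrol'}
unseen_row = {'model': 'Mustang', 'transmission': 'Manual', 'fuel_type': 'Electric'}

print(encode_categoricals(seen_row))    # [1, 1, 1]
print(encode_categoricals(unseen_row))  # [2, 1, 2]
```

This is why 'UNKNOWN' is added to every encoder: a value that only shows up at validation or inference time still maps to a valid index instead of raising a KeyError.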

For the full picture, you can also look at the corresponding notebook in the course's repository: https://github.com/lftraining/LFD273-code/blob/main/labs/Lab 2.ipynb

Best regards,
Daniel
