Use of unique(col) in lab2

In lab2 under 5.0.1 there is a method using col:

There is a line of code calling this method like so:

When the latter is run I get the following error:

When I compare against the finished lab to look for differences, it uses a different library, the older torchdata datapipes, which have a different syntax.

Understanding:
My understanding is that each string in cat_attr is passed in turn, so the method is called once per column. Each gen_encoder_dict(series) call then runs unique() and expects col to have a value, but the function has no access to that variable. I tried a couple of different things, but I am not 100% sure what the proper fix is, since the code appears this way in both the lab and the lesson notes.

Question
Does this code block compile and run on your side?

Best Answer

  • dvgodoy

    Hi @p.hanel,

    Apologies for the delayed response.
    You're right, the function should read:

    def gen_encoder_dict(dataset, col):
        values = sorted(dataset.unique(col))
        values += ['UNKNOWN']
        return dict(zip(values, range(len(values))))
    

    And when it's called, it should be:

    dropdown_encoders = {col: gen_encoder_dict(datasets['train'], col) for col in cat_attr}
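
To make the effect of the corrected signature concrete, here is a minimal sketch that runs without the lab's data. `TinyDataset` is a hypothetical stand-in that only mimics a `unique(col)` method, and the sample rows are invented for illustration; the lab itself uses its own dataset object.

```python
class TinyDataset:
    """Hypothetical stand-in exposing only unique(col); not the lab's dataset class."""

    def __init__(self, rows):
        self.rows = rows

    def unique(self, col):
        # Distinct values of one column, in first-seen order.
        return list(dict.fromkeys(row[col] for row in self.rows))


def gen_encoder_dict(dataset, col):
    # col is now an explicit parameter, so the function no longer
    # depends on a variable defined somewhere outside of it.
    values = sorted(dataset.unique(col))
    values += ['UNKNOWN']
    return dict(zip(values, range(len(values))))


# Invented sample rows, for illustration only.
rows = [
    {'model': 'Fiesta', 'transmission': 'Manual', 'fuel_type': 'Petrol'},
    {'model': 'Focus', 'transmission': 'Automatic', 'fuel_type': 'Diesel'},
    {'model': 'Fiesta', 'transmission': 'Manual', 'fuel_type': 'Diesel'},
]
datasets = {'train': TinyDataset(rows)}
cat_attr = ['model', 'transmission', 'fuel_type']

dropdown_encoders = {col: gen_encoder_dict(datasets['train'], col) for col in cat_attr}
print(dropdown_encoders['model'])  # {'Fiesta': 0, 'Focus': 1, 'UNKNOWN': 2}
```

Each column gets its own dictionary, with 'UNKNOWN' always taking the last index.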
    

    Moreover, the preproc() function is duplicated. Please ignore the first one. The second - the correct one - is unfortunately missing proper indentation. It should read:

    import numpy as np  # assumed to be imported earlier in the notebook

    def preproc(row):
        colnames = ['model', 'year', 'price', 'transmission', 'mileage', 'fuel_type', 'road_tax', 'mpg', 'engine_size']#, 'manufacturer']
    
        # Categorical columns, continuous columns, and the target column
        cat_attr = ['model', 'transmission', 'fuel_type']
        cont_attr = ['year', 'mileage', 'road_tax', 'mpg', 'engine_size']
        target = 'price'
    
        # Continuous features are cast to float
        cont_X = [float(row[name]) for name in cont_attr]
        # Categorical features become integer codes; unseen values fall back to 'UNKNOWN'
        cat_X = [dropdown_encoders[name].get(row[name], dropdown_encoders[name]['UNKNOWN']) for name in cat_attr]
    
        return {'label': np.array([float(row[target])], dtype=np.float32),
                'cont_X': np.array(cont_X, dtype=np.float32),
                'cat_X': np.array(cat_X, dtype=int)}
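
The categorical lookup in preproc() relies on dict.get() with the 'UNKNOWN' index as its default. The sketch below isolates just that step; the encoder contents and the two sample rows are made up for illustration, and `encode_categoricals` is a hypothetical helper name, not part of the lab.

```python
# Hypothetical encoders, shaped like the output of gen_encoder_dict.
dropdown_encoders = {
    'model': {'Fiesta': 0, 'Focus': 1, 'UNKNOWN': 2},
    'transmission': {'Automatic': 0, 'Manual': 1, 'UNKNOWN': 2},
    'fuel_type': {'Diesel': 0, 'Petrol': 1, 'UNKNOWN': 2},
}
cat_attr = ['model', 'transmission', 'fuel_type']


def encode_categoricals(row):
    # Same expression as in preproc(): look up each value and fall back
    # to the column's 'UNKNOWN' index when the value was never seen.
    return [dropdown_encoders[name].get(row[name], dropdown_encoders[name]['UNKNOWN'])
            for name in cat_attr]


seen_row = {'model': 'Focus', 'transmission': 'Automatic', 'fuel_type': 'Diesel'}
unseen_row = {'model': 'Puma', 'transmission': 'Manual', 'fuel_type': 'Hybrid'}

print(encode_categoricals(seen_row))    # [1, 0, 0]
print(encode_categoricals(unseen_row))  # [2, 1, 2]
```

Values that were absent from the training split map to the last index instead of raising a KeyError.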
    

    For the full picture, you can also look at the corresponding notebook in the course's repository: https://github.com/lftraining/LFD273-code/blob/main/labs/Lab 2.ipynb

    Best regards,
    Daniel
