Why does memory allocated by dma_alloc_coherent() give better performance?

I am writing a driver to read from and write to an AXI-Lite peripheral. Since AXI-Lite is a memory-mapped interface, I am implementing the mmap function in my driver. I am experimenting with two approaches:

1. Allocating the memory with dma_alloc_coherent() and then mapping it to user space with dma_mmap_coherent() (gives better performance).
2. Mapping the memory to user space with remap_pfn_range(), together with pgprot_noncached() to avoid caching problems (gives poor performance).

Reads and writes are faster with dma_alloc_coherent() than with remap_pfn_range(). If I remove pgprot_noncached() from the code, the performance of the memory mapped by remap_pfn_range() also improves, but I then see data mismatches, which are caused by caching.
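For anyone who wants to reproduce the comparison, the read/write timing described above could be measured from user space with something like the sketch below. It is only an illustration, not the code used for the measurements in this post: the helper names map_region() and time_word_access() are hypothetical, and so is any device node passed to it. When no path is given it maps anonymous memory, so the harness itself can be tried without the hardware. The read-back check only confirms that the CPU sees its own writes; it does not detect coherency problems between the CPU and the device.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

/* Map `len` bytes read/write. With a non-NULL `path` (a hypothetical device
 * node exposed by the driver) the driver's mmap handler is exercised; with
 * NULL, anonymous memory is mapped so the harness can run without hardware. */
static volatile uint32_t *map_region(const char *path, size_t len)
{
    void *p;

    if (path) {
        int fd = open(path, O_RDWR | O_SYNC);
        if (fd < 0)
            return NULL;
        p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd); /* the mapping stays valid after the descriptor is closed */
    } else {
        p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    }
    return p == MAP_FAILED ? NULL : (volatile uint32_t *)p;
}

/* Write `words` 32-bit words and read them back, `iters` times.
 * Returns the elapsed time in nanoseconds; *mismatches receives the number
 * of words that did not read back as written. */
static uint64_t time_word_access(volatile uint32_t *base, size_t words,
                                 unsigned iters, size_t *mismatches)
{
    struct timespec t0, t1;
    size_t bad = 0;

    assert(base != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (unsigned it = 0; it < iters; it++) {
        for (size_t i = 0; i < words; i++)
            base[i] = (uint32_t)(i ^ it);
        for (size_t i = 0; i < words; i++)
            if (base[i] != (uint32_t)(i ^ it))
                bad++;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    *mismatches = bad;
    int64_t ns = (int64_t)(t1.tv_sec - t0.tv_sec) * 1000000000LL
               + (int64_t)(t1.tv_nsec - t0.tv_nsec);
    return (uint64_t)ns;
}
```

Running time_word_access() on the region returned by map_region() once per driver variant, with the same word count and iteration count, should give directly comparable numbers. The mapping can be released with munmap() afterwards.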

1. Provides better performance:

virt = dmam_alloc_coherent(dev, size, &phys, GFP_KERNEL);

dma_mmap_coherent(dev, vma, virt, phys, size);

2. Provides poor performance:

vma->vm_flags |= VM_IO | VM_DONTDUMP;
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
remap_pfn_range(vma, vma->vm_start, mem->start >> PAGE_SHIFT, size, vma->vm_page_prot);

My questions are:

I am unable to understand why there is a performance hit with remap_pfn_range() compared to dma_alloc_coherent().

I am assuming that dma_alloc_coherent() returns non-cached memory by default, while in the other case we make the memory non-cached explicitly with pgprot_noncached(). Does pgprot_noncached() perform some memory operation or translation that adds overhead?

