This repository has been archived by the owner on Jan 7, 2023. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 57
Simd32 integration #104
Open
vbajaj1986
wants to merge
368
commits into
intel:master
Choose a base branch
from
vbajaj1986:simd32_integration
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Simd32 integration #104
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Andres Gomez <agomez@igalia.com>
Fixes crash with piglit/bin/map_buffer_range-invalidate CopyBufferSubData \ increment-offset -auto -fbo * Resize the resource storage already when the count is equal to the allocated size, fixes: Invalid write of size 8 at 0xB72E4CF: virgl_drm_add_res (virgl_drm_winsys.c:629) by 0xB72E4CF: virgl_drm_emit_res (virgl_drm_winsys.c:663) by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776) by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585) by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940) by 0x109A1E: upload (invalidate.c:169) by 0x109C2F: piglit_display (invalidate.c:215) by 0x4F80FBE: run_test (piglit_fbo_framework.c:52) by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10949D: main (invalidate.c:47) Address 0xbe07d30 is 0 bytes after a block of size 4,096 alloc'd at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) by 0xB72DAAF: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:567) * Also resize the space allocated for the handles, fixes: Invalid write of size 4 at 0xB72E4F0: virgl_drm_add_res (virgl_drm_winsys.c:631) by 0xB72E4F0: virgl_drm_emit_res (virgl_drm_winsys.c:663) by 0xB72A44A: virgl_encode_resource_copy_region (virgl_encode.c:776) by 0xB40CD12: st_copy_buffer_subdata (st_cb_bufferobjects.c:585) by 0xB244A3B: _mesa_CopyBufferSubData (bufferobj.c:2940) by 0x109A1E: upload (invalidate.c:169) by 0x109C2F: piglit_display (invalidate.c:215) by 0x4F80FBE: run_test (piglit_fbo_framework.c:52) by 0x4F66E5F: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10949D: main (invalidate.c:47) Address 0xbe08570 is 0 bytes after a block of size 2,048 alloc'd at 0x4C2FB0F: malloc ( in /usr/lib/valgrind/vgpreload_memcheck-amd64- linux.so) by 0xB72DAC8: virgl_drm_cmd_buf_create (virgl_drm_winsys.c:572) Fixes: 4b15b5e ("virgl: resize resource bo allocation if we need to.") v2: - Use REALLOC macro and avoid memory leak when re-allocation fails - add Fixes tag (both Emil Velikov) - reorder commit message Signed-off-by: Gert Wollny <gert.wollny@collabora.com> (cherry picked from commit 9b0e8d8)
We require a single version of libdrm for all of our libdrm dependencies (core and driver), but the way this is structured can make the error message less than helpful, as one driver might be the one setting the libdrm requirement, while another might be the one that generates the version failure. This adds a simple message to the output announcing which libdrm module set the version, which might be more helpful. v2: - Use message suggested by Eric Engstrom Fixes: c445b1d ("meson: Use the same version for all libdrm checks") Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (cherry picked from commit d25a27e)
The brw_vs_prog_data::double_inputs_read field comes directly from shader_info::double_inputs which may contain inputs which are not actually read. Instead of using it directly, AND it with inputs_read which is only things which are read. Otherwise, we may end up subtracting too many elements when computing elem_count. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103241 Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 7b26741)
Fix an other regression of mesa: Make gl_vertex_array contain pointers to first order VAO members. The regression showed up with drivers using the tnl module and was reproducible using xonotic-glx -benchmark demos/the-big-keybench.dem. Fixes: 64d2a20 mesa: Make gl_vertex_array contain pointers to first order VAO members. Tested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de> (cherry picked from commit a6232b6)
We should exit from the function 'util_vasprintf' with error code -1 for case where 'malloc' returns NULL Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Fixes: 864148d "util: add util_vasprintf() for Windows (v2)" Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> (cherry picked from commit 65cfe69)
MSDN: "va_end must be called on each argument list that's initialized with va_start or va_copy before the function returns." Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107810 Fixes: c6267eb "gallium/util: Stop bundling our snprintf implementation." Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> (cherry picked from commit 2930b76)
If we have something like: #ifdef NOT_DEFINED #define A_MACRO(x) \ if (x) #endif The # on the #define is not skipped but the define itself is so this then gets recognised as #if. Until 28a3731 this didn't happen because we ended up in <HASH>{NONSPACE} where BEGIN INITIAL was called stopping the problem from happening. This change makes sure we never call RETURN_TOKEN_NEVER_SKIP for if/else/endif when processing a define. Cc: Ian Romanick <idr@freedesktop.org> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107772 Tested-By: Eero Tamminen <eero.t.tamminen@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (cherry picked from commit b9fe8ff)
fixes: This commit was immediately reverted by commit 2dce117. Signed-off-by: Andres Gomez <agomez@igalia.com>
Seems in case of 32-bit library, usage of msse2 makes some stack corruption or incorrect instructions. Usage with mstackrealign fixes that case. v2: Fixed meson. v3: Definition of c_sse2_args moved on the top (L.Landwerlin). Added mstackrealign for Android's mks where msee4.1 is used. v4: Added for Vulkan also. v5: Commit message correction. CC: <mesa-stable@lists.freedesktop.org> Fixes: 6b05c08 (i965: Compile with -msse3) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107779 Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Reviewed-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit d709f12)
Fixes dEQP-GLES3.functional.fragment_ops.blend.default_framebuffer.rgb_func_alpha_func.dst.src_alpha_saturate_src_alpha_saturate and friends with --deqp-egl-config-name=rgb565d0s0 Cc: "18.2" <mesa-stable@lists.freedesktop.org> (cherry picked from commit f73f748)
gen9 hardware has a bug in the sampler cache that can cause GPU hangs whenever an texture with aux compression enabled is in the sampler cache together with an ASTC5x5 texture. Because we can't control what the client binds at any given time, we have two options: resolve the CCS or decompresss the ASTC. Doing a CCS or HiZ resolve is far less drastic and will likely have a smaller performance impact. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com> (cherry picked from commit f9e630e)
Some of the bits of VERTEX_BUFFER_STATE such as access type, instance data step rate, and pitch come from the pipeline. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit c643c5e)
I have no idea if I'm correct about what's going wrong or if this is the correct fix. However, in my multiple weeks of banging my head on this hang, a VUE reference counting bug seems to match all the symptoms and it definitely fixes the hang. Cc: mesa-stable@lists.freedesktop.org Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107280 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit b08b4b2)
Building of 32bit mesa with meson causes issue: "implicit declaration of function ‘__builtin_ia32_clflush’". Fixed by adding msse2 compilation flag. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107843 Fixes: 314879f (i965: Fix asynchronous mappings on !LLC platforms.) Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (cherry picked from commit 97fcccb) [Andres Gomez: resolve trivial conflicts] Signed-off-by: Andres Gomez <agomez@igalia.com> Conflicts: src/intel/tools/meson.build
There were two bugs working together to make things mostly work: I wasn't dividing the VPM output size available by the size of a batch (vertex), but I also had the size of the VPM reduced by a factor of 8. Fixes dEQP-GLES3.functional.vertex_array_objects.all_attributes and it seems also my intermittent varying failures. Fixes: 1561e49 ("v3d: Emit the VCM_CACHE_SIZE packet.") (cherry picked from commit a91b158)
The Vulkan 1.1.81 spec says: "It is legal for offset.x + extent.width or offset.y + extent.height to exceed the dimensions of the framebuffer - the scissor test still applies as defined above. Rasterization does not produce fragments outside of the framebuffer, so such fragments never have the scissor test performed on them." Elsewhere, the Vulkan 1.1.81 spec says: "The application must ensure (using scissor if necessary) that all rendering is contained within the render area, otherwise the pixels outside of the render area become undefined and shader side effects may occur for fragments outside the render area. The render area must be contained within the framebuffer dimensions." Unfortunately, there's some room for interpretation here as to what the consequences are of having the render area set to exactly the framebuffer dimensions and having a scissor that is larger than the framebuffer. Given that GL and other APIs provide automatic clipping to the framebuffer, it makes sense that applications would assume that Vulkan does this as well. It costs us very little to play it safe and just clamp client-provided scissors to the framebuffer dimensions. Fortunately, the user is required to provide us with at least one scissor so we don't need to handle the case where they don't. Fixes: fb2a5ce "anv: Emit DRAWING_RECTANGLE once at driver..." Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit 465e5a8)
VI uses addrlib so it's unaffected. Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (cherry picked from commit a1b9a00)
… on SI/CI Cc: 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (cherry picked from commit d4e5228)
Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (cherry picked from commit da72b62)
important for debugging Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (cherry picked from commit 662db03)
Cc: 18.2 <mesa-stable@lists.freedesktop.org> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (cherry picked from commit a5f35aa)
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit 34a17a4)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> (cherry picked from commit 6f00785)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> CC: 18.2 <mesa-stable@lists.freedesktop.org> (cherry picked from commit f6e09db)
This fixes the situation where we'd send a shader with just the header and no data. piglit/glsl-max-varyings test was causing this to happen, and the renderer fix was breaking it. v2: drop fprintf Fixes: a8987b8 "virgl: add driver for virtio-gpu 3D (v2)" Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> (cherry picked from commit 240af61)
This change makes following test pass: dEQP-VK.api.info.device.extensions Test: dEQP-VK.api.info.device.extensions Signed-off-by: Tapani Pälli <tapani.palli@intel.com> [strassek: carry this patch until the extensions are whitelisted in CTS]
…UsageANDROID Android P and earlier expect that the surface supports storage images, and so many of the tests fail when the framework checks for that support. The framework also includes various image format and usage combinations that are invalid for the hardware. Drop the STORAGE restriction from the HAL and whitelist a pair of formats so that existing versions of Android can pass these tests. Fixes: dEQP-VK.wsi.android.* Signed-off-by: Kevin Strasser <kevin.strasser@intel.com> (am from https://patchwork.freedesktop.org/patch/247681/)
…StorageOES In the same fashion as is done for glEGLImageTextureTarget2D. v2: share the fallback which sets baseformat and internalformat correctly which makes both of the tests pass (Tapani) Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests: #SingleLayer_ColorTest_GpuColorOutputCpuRead_R8G8B8X8_UNORM #SingleLayer_ColorTest_GpuColorOutputIsRenderable_R8G8B8X8_UNORM Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Gurchetan Singh <gurchetansingh@chromium.org> (cherry picked from commit 47e3338)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> (am from https://patchwork.freedesktop.org/patch/210951/)
Set bit when initializing a device. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> (am from https://patchwork.freedesktop.org/patch/210949/)
Set bit when initializing context. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> (am from https://patchwork.freedesktop.org/patch/210950/)
Gen9 hardware requires some workarounds to disable preemption depending on the type of primitive being emitted. We implement this by adding a new atom that tracks BRW_NEW_PRIMITIVE. Whenever it happens, we check the current type of primitive and enable/disable object preemption. For now, we just ignore blorp. The only primitive it emits is 3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can safely leave preemption enabled when it happens. Or it will be disabled by a previous 3DPRIMITIVE, which should be fine too. Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> (am from https://patchwork.freedesktop.org/patch/210952/)
Apparently, we're supposed to look at the texture object's built-in sampler object's sRGB decode setting in order to decide whether to decode/downsample/re-encode, or simply downsample as-is. Previously, I had always done the decoding/encoding. Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> (cherry picked from commit 337a808)
…pport Fixes Skqp's unitTest_EGLImageTest test. For Intel platforms, we support external textures only for EGLImages created with EGL_EXT_image_dma_buf_import. This restriction seems to be Intel specific and not present for other platforms. While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent to the test because of this restriction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301 Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> (cherry picked from commit a5c39ed)
…ort it on gen8+ Dual source blending behaviour is undefined when shader doesn't have second color output, dismissing fragment in such situation leads to a hang on gen8+ if depth test in enabled. Since blending cannot be gracefully fixed in such case and the result is undefined - blending is simply disabled. v2 (Kenneth Graunke): - Listen to BRW_NEW_FS_PROG_DATA in 3DSTATE_PS_BLEND - Also whack BLEND_STATE[] to keep the two in sync, since we're not sure exactly which copy of the redundant info the hardware will use. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107088 Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (cherry picked from commit eca4a65)
Do we have performance data on these changes, what's the impact? |
+kishore
From: Tapani Pälli [mailto:notifications@github.com]
Sent: Thursday, November 22, 2018 1:58 PM
To: intel/external-mesa <external-mesa@noreply.github.com>
Cc: Bajaj, Vineet <vineet.bajaj@intel.com>; Author <author@noreply.github.com>
Subject: Re: [intel/external-mesa] Simd32 integration (#104)
Do we have performance data on these changes, what's the impact?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub<#104 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/AqTv86gNrEV1dFuidwxVyToQktSvt7YKks5uxl_6gaJpZM4Yo7mA>.
|
tpalli
approved these changes
Nov 30, 2018
strassek
approved these changes
Dec 1, 2018
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Has this series changed at all since the RFC [1] Toni posted to mesa-dev? I'd like to maintain links to the upstream submission if possible.
[1] https://patchwork.freedesktop.org/series/51006/
Merged and pushed to master |
These patches are causing visual artifacts on Celadon home screen, so I've included a revert in master branch. If this feature is still needed the issues will need to be resolved. |
strassek
force-pushed
the
master
branch
3 times, most recently
from
July 30, 2019 18:21
a69fa2f
to
6959b05
Compare
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
No description provided.