1
0
Fork 0
mutter-performance-source/cogl/cogl-journal-private.h

115 lines
3.7 KiB
C
Raw Normal View History

/*
* Cogl
*
* An object oriented GL/GLES Abstraction/Utility Layer
*
* Copyright (C) 2007,2008,2009 Intel Corporation.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library. If not, see <http://www.gnu.org/licenses/>.
*
*
*/
#ifndef __COGL_JOURNAL_PRIVATE_H
#define __COGL_JOURNAL_PRIVATE_H
Change API so that CoglPixelBuffer no longer knows its w/h/format The idea is that CoglPixelBuffer should just be a buffer that can be used for pixel data and it has no idea about the details of any images that are stored in it. This is analogous to CoglAttributeBuffer which itself does not have any information about the attributes. When you want to use a pixel buffer you should create a CoglBitmap which points to a region of the attribute buffer and provides the extra needed information such as the width, height and format. That way it is also possible to use a single CoglPixelBuffer with multiple bitmaps. The changes that are made are: • cogl_pixel_buffer_new_with_size has been removed and in its place is cogl_bitmap_new_with_size. This will create a pixel buffer at the right size and rowstride for the given width/height/format and immediately create a single CoglBitmap to point into it. The old function had an out-parameter for the stride of the image but with the new API this should be queriable from the bitmap (although there is no function for this yet). • There is now a public cogl_pixel_buffer_new constructor. This takes a size in bytes and data pointer similarly to cogl_attribute_buffer_new. • cogl_texture_new_from_buffer has been removed. If you want to create a texture from a pixel buffer you should wrap it up in a bitmap first. There is already API to create a texture from a bitmap. This patch also does a bit of header juggling because cogl-context.h was including cogl-texture.h and cogl-framebuffer.h which were causing some circular dependencies when cogl-bitmap.h includes cogl-context.h. These weren't actually needed in cogl-context.h itself but a few other headers were relying on them being included so this adds the #includes where necessary. Reviewed-by: Robert Bragg <robert@linux.intel.com>
2012-02-25 20:04:45 +00:00
#include "cogl-texture.h"
#include "cogl-object-private.h"
#include "cogl-clip-stack.h"
#include "cogl-fence-private.h"
#define COGL_JOURNAL_VBO_POOL_SIZE 8
typedef struct _CoglJournal
{
CoglObject _parent;
/* A pointer the framebuffer that is using this journal. This is
only valid when the journal is not empty. It *does* take a
reference on the framebuffer. Although this creates a circular
reference, the framebuffer has special code to handle the case
where the journal is the only thing holding a reference and it
will cause the journal to flush */
CoglFramebuffer *framebuffer;
GArray *entries;
GArray *vertices;
size_t needed_vbo_len;
/* A pool of attribute buffers is used so that we can avoid repeatedly
reallocating buffers. Only one of these buffers at a time will be
used by Cogl but we keep more than one alive anyway in case the
GL driver is internally using the buffer and it would have to
allocate a new one when we start writing to it */
CoglAttributeBuffer *vbo_pool[COGL_JOURNAL_VBO_POOL_SIZE];
/* The next vbo to use from the pool. We just cycle through them in
order */
unsigned int next_vbo_in_pool;
cogl: Implements a software only read-pixel fast-path This adds a transparent optimization to cogl_read_pixels for when a single pixel is being read back and it happens that all the geometry of the current frame is still available in the framebuffer's associated journal. The intention is to indirectly optimize Clutter's render based picking mechanism in such a way that the 99% of cases where scenes are comprised of trivial quad primitives that can easily be intersected we can avoid the latency of kicking a GPU render and blocking for the result when we know we can calculate the result manually on the CPU probably faster than we could even kick a render. A nice property of this solution is that it maintains all the flexibility of the render based picking provided by Clutter and it can gracefully fall back to GPU rendering if actors are drawn using anything more complex than a quad for their geometry. It seems worth noting that there is a limitation to the extensibility of this approach in that it can only optimize picking a against geometry that passes through Cogl's journal which isn't something Clutter directly controls. For now though this really doesn't matter since basically all apps should end up hitting this fast-path. The current idea to address this longer term would be a pick2 vfunc for ClutterActor that can support geometry and render based input regions of actors and move this optimization up into Clutter instead. Note: currently we don't have a primitive count threshold to consider that there could be scenes with enough geometry for us to compensate for the cost of kicking a render and determine a result more efficiently by utilizing the GPU. We don't currently expect this to be common though. Note: in the future it could still be interesting to revive something like the wip/async-pbo-picking branch to provide an asynchronous read-pixels based optimization for Clutter picking in cases where more complex input regions that necessitate rendering are in use or if we do add a threshold for rendering as mentioned above.
2011-01-12 22:12:41 +00:00
int fast_read_pixel_count;
Use the Wayland embedded linked list implementation instead of BSD's This removes cogl-queue.h and adds a copy of Wayland's embedded list implementation. The advantage of the Wayland model is that it is much simpler and so it is easier to follow. It also doesn't require defining a typedef for every list type. The downside is that there is only one list type which is a doubly-linked list where the head has a pointer to both the beginning and the end. The BSD implementation has many more combinations some of which we were taking advantage of to reduce the size of critical structs where we didn't need a pointer to the end of the list. The corresponding changes to uses of cogl-queue.h are: • COGL_STAILQ_* was used for onscreen the list of events and dirty notifications. This makes the size of the CoglContext grow by one pointer. • COGL_TAILQ_* was used for fences. • COGL_LIST_* for CoglClosures. In this case the list head now has an extra pointer which means CoglOnscreen will grow by the size of three pointers, but this doesn't seem like a particularly important struct to optimise for size anyway. • COGL_LIST_* was used for the list of foreign GLES2 offscreens. • COGL_TAILQ_* was used for the list of sub stacks in a CoglMemoryStack. • COGL_LIST_* was used to track the list of layers that haven't had code generated yet while generating a fragment shader for a pipeline. • COGL_LIST_* was used to track the pipeline hierarchy in CoglNode. The last part is a bit more controversial because it increases the size of CoglPipeline and CoglPipelineLayer by one pointer in order to have the redundant tail pointer for the list head. Normally we try to be very careful about the size of the CoglPipeline struct. Because CoglPipeline is slice-allocated, this effectively ends up adding two pointers to the size because GSlice rounds up to the size of two pointers. Reviewed-by: Robert Bragg <robert@linux.intel.com> (cherry picked from commit 13abf613b15f571ba1fcf6d2eb831ffc6fa31324) Conflicts: cogl/cogl-context-private.h cogl/cogl-context.c cogl/driver/gl/cogl-pipeline-fragend-glsl.c doc/reference/cogl-2.0-experimental/Makefile.am
2013-06-08 22:03:25 +00:00
CoglList pending_fences;
} CoglJournal;
/* To improve batching of geometry when submitting vertices to OpenGL we
* log the texture rectangles we want to draw to a journal, so when we
* later flush the journal we aim to batch data, and gl draw calls. */
typedef struct _CoglJournalEntry
{
cogl: rename CoglMaterial -> CoglPipeline This applies an API naming change that's been deliberated over for a while now which is to rename CoglMaterial to CoglPipeline. For now the new pipeline API is marked as experimental and public headers continue to talk about materials not pipelines. The CoglMaterial API is now maintained in terms of the cogl_pipeline API internally. Currently this API is targeting Cogl 2.0 so we will have time to integrate it properly with other upcoming Cogl 2.0 work. The basic reasons for the rename are: - That the term "material" implies to many people that they are constrained to fragment processing; perhaps as some kind of high-level texture abstraction. - In Clutter they get exposed by ClutterTexture actors which may be re-inforcing this misconception. - When comparing how other frameworks use the term material, a material sometimes describes a multi-pass fragment processing technique which isn't the case in Cogl. - In code, "CoglPipeline" will hopefully be a much more self documenting summary of what these objects represent; a full GPU pipeline configuration including, for example, vertex processing, fragment processing and blending. - When considering the API documentation story, at some point we need a document introducing developers to how the "GPU pipeline" works so it should become intuitive that CoglPipeline maps back to that description of the GPU pipeline. - This is consistent in terminology and concept to OpenGL 4's new pipeline object which is a container for program objects. Note: The cogl-material.[ch] files have been renamed to cogl-material-compat.[ch] because otherwise git doesn't seem to treat the change as a moving the old cogl-material.c->cogl-pipeline.c and so we loose all our git-blame history.
2010-10-27 17:54:57 +00:00
CoglPipeline *pipeline;
Re-design the matrix stack using a graph of ops This re-designs the matrix stack so we now keep track of each separate operation such as rotating, scaling, translating and multiplying as immutable, ref-counted nodes in a graph. Being a "graph" here means that different transformations composed of a sequence of linked operation nodes may share nodes. The first node in a matrix-stack is always a LOAD_IDENTITY operation. As an example consider if an application where to draw three rectangles A, B and C something like this: cogl_framebuffer_scale (fb, 2, 2, 2); cogl_framebuffer_push_matrix(fb); cogl_framebuffer_translate (fb, 10, 0, 0); cogl_framebuffer_push_matrix(fb); cogl_framebuffer_rotate (fb, 45, 0, 0, 1); cogl_framebuffer_draw_rectangle (...); /* A */ cogl_framebuffer_pop_matrix(fb); cogl_framebuffer_draw_rectangle (...); /* B */ cogl_framebuffer_pop_matrix(fb); cogl_framebuffer_push_matrix(fb); cogl_framebuffer_set_modelview_matrix (fb, &mv); cogl_framebuffer_draw_rectangle (...); /* C */ cogl_framebuffer_pop_matrix(fb); That would result in a graph of nodes like this: LOAD_IDENTITY | SCALE / \ SAVE LOAD | | TRANSLATE RECTANGLE(C) | \ SAVE RECTANGLE(B) | ROTATE | RECTANGLE(A) Each push adds a SAVE operation which serves as a marker to rewind too when a corresponding pop is issued and also each SAVE node may also store a cached matrix representing the composition of all its ancestor nodes. This means if we repeatedly need to resolve a real CoglMatrix for a given node then we don't need to repeat the composition. Some advantages of this design are: - A single pointer to any node in the graph can now represent a complete, immutable transformation that can be logged for example into a journal. Previously we were storing a full CoglMatrix in each journal entry which is 16 floats for the matrix itself as well as space for flags and another 16 floats for possibly storing a cache of the inverse. This means that we significantly reduce the size of the journal when drawing lots of primitives and we also avoid copying over 128 bytes per entry. - It becomes much cheaper to check for equality. In cases where some (unlikely) false negatives are allowed simply comparing the pointers of two matrix stack graph entries is enough. Previously we would use memcmp() to compare matrices. - It becomes easier to do comparisons of transformations. By looking for the common ancestry between nodes we can determine the operations that differentiate the transforms and use those to gain a high level understanding of the differences. For example we use this in the journal to be able to efficiently determine when two rectangle transforms only differ by some translation so that we can perform software clipping. Reviewed-by: Neil Roberts <neil@linux.intel.com> (cherry picked from commit f75aee93f6b293ca7a7babbd8fcc326ee6bf7aef)
2012-02-20 15:59:48 +00:00
CoglMatrixEntry *modelview_entry;
CoglClipStack *clip_stack;
/* Offset into ctx->logged_vertices */
size_t array_offset;
int n_layers;
} CoglJournalEntry;
CoglJournal *
_cogl_journal_new (CoglFramebuffer *framebuffer);
void
_cogl_journal_log_quad (CoglJournal *journal,
const float *position,
cogl: rename CoglMaterial -> CoglPipeline This applies an API naming change that's been deliberated over for a while now which is to rename CoglMaterial to CoglPipeline. For now the new pipeline API is marked as experimental and public headers continue to talk about materials not pipelines. The CoglMaterial API is now maintained in terms of the cogl_pipeline API internally. Currently this API is targeting Cogl 2.0 so we will have time to integrate it properly with other upcoming Cogl 2.0 work. The basic reasons for the rename are: - That the term "material" implies to many people that they are constrained to fragment processing; perhaps as some kind of high-level texture abstraction. - In Clutter they get exposed by ClutterTexture actors which may be re-inforcing this misconception. - When comparing how other frameworks use the term material, a material sometimes describes a multi-pass fragment processing technique which isn't the case in Cogl. - In code, "CoglPipeline" will hopefully be a much more self documenting summary of what these objects represent; a full GPU pipeline configuration including, for example, vertex processing, fragment processing and blending. - When considering the API documentation story, at some point we need a document introducing developers to how the "GPU pipeline" works so it should become intuitive that CoglPipeline maps back to that description of the GPU pipeline. - This is consistent in terminology and concept to OpenGL 4's new pipeline object which is a container for program objects. Note: The cogl-material.[ch] files have been renamed to cogl-material-compat.[ch] because otherwise git doesn't seem to treat the change as a moving the old cogl-material.c->cogl-pipeline.c and so we loose all our git-blame history.
2010-10-27 17:54:57 +00:00
CoglPipeline *pipeline,
int n_layers,
Add a strong CoglTexture type to replace CoglHandle As part of the on going, incremental effort to purge the non type safe CoglHandle type from the Cogl API this patch tackles most of the CoglHandle uses relating to textures. We'd postponed making this change for quite a while because we wanted to have a clearer understanding of how we wanted to evolve the texture APIs towards Cogl 2.0 before exposing type safety here which would be difficult to change later since it would imply breaking APIs. The basic idea that we are steering towards now is that CoglTexture can be considered to be the most primitive interface we have for any object representing a texture. The texture interface would provide roughly these methods: cogl_texture_get_width cogl_texture_get_height cogl_texture_can_repeat cogl_texture_can_mipmap cogl_texture_generate_mipmap; cogl_texture_get_format cogl_texture_set_region cogl_texture_get_region Besides the texture interface we will then start to expose types corresponding to specific texture types: CoglTexture2D, CoglTexture3D, CoglTexture2DSliced, CoglSubTexture, CoglAtlasTexture and CoglTexturePixmapX11. We will then also expose an interface for the high-level texture types we have (such as CoglTexture2DSlice, CoglSubTexture and CoglAtlasTexture) called CoglMetaTexture. CoglMetaTexture is an additional interface that lets you iterate a virtual region of a meta texture and get mappings of primitive textures to sub-regions of that virtual region. Internally we already have this kind of abstraction for dealing with sliced texture, sub-textures and atlas textures in a consistent way, so this will just make that abstraction public. The aim here is to clarify that there is a difference between primitive textures (CoglTexture2D/3D) and some of the other high-level textures, and also enable developers to implement primitives that can support meta textures since they can only be used with the cogl_rectangle API currently. The thing that's not so clean-cut with this are the texture constructors we have currently; such as cogl_texture_new_from_file which no longer make sense when CoglTexture is considered to be an interface. These will basically just become convenient factory functions and it's just a bit unusual that they are within the cogl_texture namespace. It's worth noting here that all the texture type APIs will also have their own type specific constructors so these functions will only be used for the convenience of being able to create a texture without really wanting to know the details of what type of texture you need. Longer term for 2.0 we may come up with replacement names for these factory functions or the other thing we are considering is designing some asynchronous factory functions instead since it's so often detrimental to application performance to be blocked waiting for a texture to be uploaded to the GPU. Reviewed-by: Neil Roberts <neil@linux.intel.com>
2011-08-24 20:30:34 +00:00
CoglTexture *layer0_override_texture,
const float *tex_coords,
unsigned int tex_coords_len);
void
_cogl_journal_flush (CoglJournal *journal);
cogl: Implements a software only read-pixel fast-path This adds a transparent optimization to cogl_read_pixels for when a single pixel is being read back and it happens that all the geometry of the current frame is still available in the framebuffer's associated journal. The intention is to indirectly optimize Clutter's render based picking mechanism in such a way that the 99% of cases where scenes are comprised of trivial quad primitives that can easily be intersected we can avoid the latency of kicking a GPU render and blocking for the result when we know we can calculate the result manually on the CPU probably faster than we could even kick a render. A nice property of this solution is that it maintains all the flexibility of the render based picking provided by Clutter and it can gracefully fall back to GPU rendering if actors are drawn using anything more complex than a quad for their geometry. It seems worth noting that there is a limitation to the extensibility of this approach in that it can only optimize picking a against geometry that passes through Cogl's journal which isn't something Clutter directly controls. For now though this really doesn't matter since basically all apps should end up hitting this fast-path. The current idea to address this longer term would be a pick2 vfunc for ClutterActor that can support geometry and render based input regions of actors and move this optimization up into Clutter instead. Note: currently we don't have a primitive count threshold to consider that there could be scenes with enough geometry for us to compensate for the cost of kicking a render and determine a result more efficiently by utilizing the GPU. We don't currently expect this to be common though. Note: in the future it could still be interesting to revive something like the wip/async-pbo-picking branch to provide an asynchronous read-pixels based optimization for Clutter picking in cases where more complex input regions that necessitate rendering are in use or if we do add a threshold for rendering as mentioned above.
2011-01-12 22:12:41 +00:00
void
_cogl_journal_discard (CoglJournal *journal);
CoglBool
cogl: Implements a software only read-pixel fast-path This adds a transparent optimization to cogl_read_pixels for when a single pixel is being read back and it happens that all the geometry of the current frame is still available in the framebuffer's associated journal. The intention is to indirectly optimize Clutter's render based picking mechanism in such a way that the 99% of cases where scenes are comprised of trivial quad primitives that can easily be intersected we can avoid the latency of kicking a GPU render and blocking for the result when we know we can calculate the result manually on the CPU probably faster than we could even kick a render. A nice property of this solution is that it maintains all the flexibility of the render based picking provided by Clutter and it can gracefully fall back to GPU rendering if actors are drawn using anything more complex than a quad for their geometry. It seems worth noting that there is a limitation to the extensibility of this approach in that it can only optimize picking a against geometry that passes through Cogl's journal which isn't something Clutter directly controls. For now though this really doesn't matter since basically all apps should end up hitting this fast-path. The current idea to address this longer term would be a pick2 vfunc for ClutterActor that can support geometry and render based input regions of actors and move this optimization up into Clutter instead. Note: currently we don't have a primitive count threshold to consider that there could be scenes with enough geometry for us to compensate for the cost of kicking a render and determine a result more efficiently by utilizing the GPU. We don't currently expect this to be common though. Note: in the future it could still be interesting to revive something like the wip/async-pbo-picking branch to provide an asynchronous read-pixels based optimization for Clutter picking in cases where more complex input regions that necessitate rendering are in use or if we do add a threshold for rendering as mentioned above.
2011-01-12 22:12:41 +00:00
_cogl_journal_all_entries_within_bounds (CoglJournal *journal,
float clip_x0,
float clip_y0,
float clip_x1,
float clip_y1);
CoglBool
cogl: Implements a software only read-pixel fast-path This adds a transparent optimization to cogl_read_pixels for when a single pixel is being read back and it happens that all the geometry of the current frame is still available in the framebuffer's associated journal. The intention is to indirectly optimize Clutter's render based picking mechanism in such a way that the 99% of cases where scenes are comprised of trivial quad primitives that can easily be intersected we can avoid the latency of kicking a GPU render and blocking for the result when we know we can calculate the result manually on the CPU probably faster than we could even kick a render. A nice property of this solution is that it maintains all the flexibility of the render based picking provided by Clutter and it can gracefully fall back to GPU rendering if actors are drawn using anything more complex than a quad for their geometry. It seems worth noting that there is a limitation to the extensibility of this approach in that it can only optimize picking a against geometry that passes through Cogl's journal which isn't something Clutter directly controls. For now though this really doesn't matter since basically all apps should end up hitting this fast-path. The current idea to address this longer term would be a pick2 vfunc for ClutterActor that can support geometry and render based input regions of actors and move this optimization up into Clutter instead. Note: currently we don't have a primitive count threshold to consider that there could be scenes with enough geometry for us to compensate for the cost of kicking a render and determine a result more efficiently by utilizing the GPU. We don't currently expect this to be common though. Note: in the future it could still be interesting to revive something like the wip/async-pbo-picking branch to provide an asynchronous read-pixels based optimization for Clutter picking in cases where more complex input regions that necessitate rendering are in use or if we do add a threshold for rendering as mentioned above.
2011-01-12 22:12:41 +00:00
_cogl_journal_try_read_pixel (CoglJournal *journal,
int x,
int y,
CoglBitmap *bitmap,
CoglBool *found_intersection);
cogl: Implements a software only read-pixel fast-path This adds a transparent optimization to cogl_read_pixels for when a single pixel is being read back and it happens that all the geometry of the current frame is still available in the framebuffer's associated journal. The intention is to indirectly optimize Clutter's render based picking mechanism in such a way that the 99% of cases where scenes are comprised of trivial quad primitives that can easily be intersected we can avoid the latency of kicking a GPU render and blocking for the result when we know we can calculate the result manually on the CPU probably faster than we could even kick a render. A nice property of this solution is that it maintains all the flexibility of the render based picking provided by Clutter and it can gracefully fall back to GPU rendering if actors are drawn using anything more complex than a quad for their geometry. It seems worth noting that there is a limitation to the extensibility of this approach in that it can only optimize picking a against geometry that passes through Cogl's journal which isn't something Clutter directly controls. For now though this really doesn't matter since basically all apps should end up hitting this fast-path. The current idea to address this longer term would be a pick2 vfunc for ClutterActor that can support geometry and render based input regions of actors and move this optimization up into Clutter instead. Note: currently we don't have a primitive count threshold to consider that there could be scenes with enough geometry for us to compensate for the cost of kicking a render and determine a result more efficiently by utilizing the GPU. We don't currently expect this to be common though. Note: in the future it could still be interesting to revive something like the wip/async-pbo-picking branch to provide an asynchronous read-pixels based optimization for Clutter picking in cases where more complex input regions that necessitate rendering are in use or if we do add a threshold for rendering as mentioned above.
2011-01-12 22:12:41 +00:00
CoglBool
Add -Wmissing-declarations to maintainer flags and fix problems This option to GCC makes it give a warning whenever a global function is defined without a declaration. This should catch cases were we've defined a function but forgot to put it in a header. In that case it is either only used within one file so we should make it static or we should declare it in a header. The following changes where made to fix problems: • Some functions were made static • cogl-path.h (the one containing the 1.0 API) was split into two files, one defining the functions and one defining the enums so that cogl-path.c can include the enum and function declarations from the 2.0 API as well as the function declarations from the 1.0 API. • cogl2-clip-state has been removed. This only had one experimental function called cogl_clip_push_from_path but as this is unstable we might as well remove it favour of the equivalent cogl_framebuffer_* API. • The GLX, SDL and WGL winsys's now have a private header to define their get_vtable function instead of directly declaring in the C file where it is called. • All places that were calling COGL_OBJECT_DEFINE need to have the cogl_is_whatever function declared so these have been added either as a public function or in a private header. • Some files that were not including the header containing their function declarations have been fixed to do so. • Any unused error quark functions have been removed. If we later want them we should add them back one by one and add a declaration for them in a header. • _cogl_is_framebuffer has been renamed to cogl_is_framebuffer and made a public function with a declaration in cogl-framebuffer.h • Similarly for CoglOnscreen. • cogl_vdraw_indexed_attributes is called cogl_framebuffer_vdraw_indexed_attributes in the header. The definition has been changed to match the header. • cogl_index_buffer_allocate has been removed. This had no declaration and I'm not sure what it's supposed to do. • CoglJournal has been changed to use the internal CoglObject macro so that it won't define an exported cogl_is_journal symbol. • The _cogl_blah_pointer_from_handle functions have been removed. CoglHandle isn't used much anymore anyway and in the few places where it is used I think it's safe to just use the implicit cast from void* to the right type. • The test-utils.h header for the conformance tests explicitly disables the -Wmissing-declaration option using a pragma because all of the tests declare their main function without a header. Any mistakes relating to missing declarations aren't really important for the tests. • cogl_quaternion_init_from_quaternion and init_from_matrix have been given declarations in cogl-quaternion.h Reviewed-by: Robert Bragg <robert@linux.intel.com>
2012-03-06 18:21:28 +00:00
_cogl_is_journal (void *object);
#endif /* __COGL_JOURNAL_PRIVATE_H */